DOCUMENT RESUME 



ED 061 784 



FL 002 937 



AUTHOR: Lehmann, Winifred P.; Stachowitz, Rolf
TITLE: Feasibility Study on Fully Automatic High Quality Translation:
    Volume II. Final Technical Report.
INSTITUTION: Texas Univ., Austin. Linguistics Research Center.
SPONS AGENCY: Rome Air Development Center, Griffiss AFB, N.Y.
REPORT NO: RADC-TR-71-295
PUB DATE: Dec 71
NOTE: 263p.
EDRS PRICE: MF-$0.65 HC-$9.87
DESCRIPTORS: *Computational Linguistics; *Computers; Computer Science;
    Data Processing; *Descriptive Linguistics; Dictionaries; Information
    Storage; *Language Research; Linguistic Theory; *Machine Translation;
    Semantics; Syntax; Transformation Generative Grammar;
    Transformation Theory (Language)

ABSTRACT 

This second volume of a two-volume report on a fully
automatic high quality translation (FAHQT) contains relevant papers
contributed by specialists on the topic of machine translation. The
papers presented here cover such topics as syntactical analysis in
transformational grammar and in machine translation, lexical features
in translation and paraphrasing, requirements for machine
translation, current status of hardware and software as it affects
FAHQT, bilingual computer dictionaries, and the shape of the
dictionary for machine translation. Volume 1 (FL 002 936) includes
papers as well as specific consideration of the FAHQT inquiry.








FEASIBILITY STUDY ON FULLY AUTOMATIC HIGH QUALITY TRANSLATION

University of Texas 












RADC-TR-71-295, Vol II
Final Technical Report
December 1971










When US Government drawings, specifications, or other data are used for any purpose other
than a definitely related government procurement operation, the government thereby incurs
no responsibility nor any obligation whatsoever; and the fact that the government may have
formulated, furnished, or in any way supplied the said drawings, specifications, or other
data is not to be regarded, by implication or otherwise, as in any manner licensing the
holder or any other person or corporation, or conveying any rights or permission to
manufacture, use, or sell any patented invention that may in any way be related thereto.































FEASIBILITY STUDY ON FULLY AUTOMATIC HIGH QUALITY TRANSLATION 



Dr. Winifred P. Lehmann
Dr. Rolf Stachowitz

University of Texas




Approved for public release; 
distribution unlimited. 







Syntactic Analysis for Transformational Grammars

by

S. R. Petrick

IBM T. J. Watson Research Center





If one wishes to obtain a syntactic analysis algorithm for some class
of grammars, it is, of course, essential to characterize that class of grammars
completely and precisely. If we merely tie down those details for which there
are clear linguistic [...] aspects of the theory which have not been so
thoroughly worked out, it may be possible to propose small sets of rules [...]
generatively account for certain linguistic phenomena but it is more difficult
to [...] meaningful consideration to the problem of syntactic analysis. It is
likely that the existence of any general algorithm for syntactic analysis
depends upon the specification of the incomplete aspects of the model in [...].
It should be noted in this regard [...] in making certain decisions with
respect to the construction of a linguistic theory which would otherwise be
arbitrary.

As is well known, transformational theory has been changing rapidly
from its inception up to the present time. There is disagreement as to the
basic mechanisms that should be allowed (e.g., conventions on transformational
applicability, allowable structures, primitive transformations, etc.) and as to
[...]








descriptive practice. After reading this paper the reader can judge for 
himself the extent to which I have met the latter requirement while at the 
same time satisfying the former requirement. 

There are several alternatives to the course I have chosen that are
open to anyone wishing to work on "transformational syntactic analysis".
First, he can talk about the theoretical requirements of transformational
analysis without actually working out the complete details of an analysis
algorithm for any class of grammars. Such work can be valuable, especially
if it contributes to our knowledge of the precise nature of transformational
rules and conventions. The more assumptions we can build in, and consequently
the tighter we can make our model without impairing its facility to describe
language, the better that model is, and the closer we are to saying something
about a discovery procedure.

A second alternative (followed by the MITRE Corporation <1>) is to
seek an analysis algorithm for a particular grammar rather than for a class
of grammars. There are several objections I would raise to such a course of
action. First of all, even though linguists tend to be quite tentative and cautious
about the properties and details of a class of grammars they propose as models
of natural languages, they are certainly even more tentative about endorsing
the likelihood that the particular fragmentary grammars they produce will
stand up with the passage of time. My second objection to the consideration
of particular grammars concerns the difficulty of producing an analysis
procedure for a particular grammar. While it would [...]
one needs for a proof from a particular grammar without in the process
specifying a class of grammars that have those properties. The situation is
not unlike that in mathematics where a generalization is often easier to prove
than a more restricted result. I had an excellent professor in Theory of
Functions (William Ted Martin) who delighted in providing such examples.
Whenever we would bog down in obtaining a needed proof we would hear his
familiar advice to "Ask for more when the required result is too specific."

A third alternative which has been followed by most people who
characterize their work in syntactic analysis as "transformational" in nature
is to define a "transformational-like" linguistic theory based upon some
algorithm for syntactic analysis, not upon the usual generative transformational
apparatus. The deep structures produced by such programs often appear to
be very close to those that are assigned to the same sentences by current
transformational grammars, and the rules, which are variously called
"transformations" or "inverse transformations" or sometimes just "rewriting rules",
often bear names and functions similar to the transformations of generative
transformational grammar theory. Efforts I would classify as being of this
type have been undertaken by Kay <2>, Simmons <3>, Moyne <4>, Thorne <5>,
[...] natural language processing projects relative to existing parsers for
generative transformational grammars. Surprisingly, relatively little is made
of this by the proponents of these systems. The argument often given, on the
other [...]









in my opinion. The most important thing to note is that whatever the merits
or shortcomings of such systems, they are linguistic theories which are
unrelated to generative transformational grammar theory and as such their
proponents face the task of independently establishing their adequacy for
linguistic descriptive and explanatory purposes, that is, their capacity for
expressing significant linguistic generalizations. Unfortunately, many of the
proponents of such systems have not given enough attention to this task, basing
the justification of their linguistic theories not upon their ability to
account for specific linguistic data, but rather upon their tenuous
relationship to generative transformational theory.



Rather than discussing these alternatives further I will instead discuss
my own work on transformational syntactic analysis. Let us begin by sketching
briefly the transformational analysis algorithm of my thesis.

The model of transformational theory in question is roughly that which
was in vogue prior to Aspects of the Theory of Syntax <11>. The base
component is a context-free grammar with certain restrictions on recursiveness
and sentence embedding. Transformational applicability is specified by a
structural index. This structural index is satisfied by a proper analysis,
which is a sequence of subtrees that constitutes a single cut through a tree.
The modification to a tree [...] For our simple model [...]




for generating a given string. This reverse procedure makes use of inverse 
transformations, which are mechanically computed — one for each generative 
transformation. In analyzing a sentence S, inverse transformations are 
applied in the reverse order of that in which the corresponding generative 
transformations are applied in deriving S. 

To understand inverse transformations let us examine their generative
counterparts. We observe first of all that the structural change of a
transformation references a sequence of nodes that occur in the structural index,
interspersed possibly with additional morphemes. We call this sequence the
inverse structural index of the transformation in question. The structural
change of course gives more information than is contained in the inverse
structural index, but the latter provides the basis for our analysis algorithm.

To give an example of an inverse structural index, let us consider a
passive transformation whose structural index is (NP AUX V X NP X
PASS) and whose structural change is (5 2 (BE EN 3) 4 0 1). The
inverse structural index is (NP AUX BE EN V X X BY NP) because 5
denotes NP, 2 denotes AUX, etc.
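The way an inverse structural index is read off a structural change can be sketched in a few lines of Python. This is an illustration only, not the program described in the paper. The function name, the tuple encoding of the structural change, and in particular the eight-term structural index and eight-entry structural change below are assumptions: they are one reconstruction under which the computation yields the nine-symbol sequence NP AUX BE EN V X X BY NP, and they are not quoted from the printed text.

```python
from typing import List, Sequence, Union

# A structural-change element is an integer (0 = deletion, k = the k-th term
# of the structural index), a morpheme string, or a tuple of such elements.
ChangeElement = Union[int, str, tuple]


def inverse_structural_index(structural_index: Sequence[str],
                             structural_change: Sequence[ChangeElement]) -> List[str]:
    """Read off the nodes and morphemes referenced by a structural change."""
    result: List[str] = []
    for element in structural_change:
        if isinstance(element, tuple):
            # A parenthesized group contributes its members in order.
            result.extend(inverse_structural_index(structural_index, element))
        elif isinstance(element, int):
            if element != 0:  # 0 marks a deleted term and contributes nothing
                result.append(structural_index[element - 1])
        else:
            result.append(element)  # an inserted morpheme such as BE or EN
    return result


# Assumed reconstruction of a passive rule (hypothetical, see the note above).
PASSIVE_INDEX = ("NP", "AUX", "V", "X", "NP", "X", "BY", "PASS")
PASSIVE_CHANGE = (5, 2, ("BE", "EN", 3), 4, 0, 6, 7, 1)

print(" ".join(inverse_structural_index(PASSIVE_INDEX, PASSIVE_CHANGE)))
```

Each integer is replaced by the category it points to in the structural index, so 5 becomes NP and 2 becomes AUX, while inserted morphemes pass through unchanged and a 0 contributes nothing.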

For a transformation to be applicable to a tree T there must be a
proper analysis of T that satisfies the structural index of that transformation.










We make use of this fact in the following way. If a string of morphemes
s is the terminal string of a tree produced by the action of a transformation t,
then it must be possible to segment s such that the ith segment can be analyzed
as the ith node of the inverse structural index of t with respect to a
context-free grammar consisting of the original base component rules augmented by
rules reflecting structure that can be produced transformationally. It is
possible to mechanically derive an augmented context-free grammar that includes
all rules reflecting structure that might be formed in the transformational
derivation of a given sentence. Hence, we have a necessary test that a given string
of morphemes s was produced by a transformation t.
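The necessary test just described, segmenting a string so that the ith segment is analyzable as the ith term of an inverse structural index, can be illustrated with a small Python sketch. Everything named here is hypothetical: the toy grammar, the morpheme spellings, and the helper names are assumptions made for the example. The naive recognizer also assumes a grammar with no empty right-hand sides and no cyclic unit rules, and it treats the variable X as matching any segment, including the empty one.

```python
from functools import lru_cache
from typing import Dict, Iterator, List, Sequence, Tuple

# Toy augmented context-free grammar (hypothetical). Symbols without rules
# are terminals. Assumed: no empty right-hand sides, no cyclic unit rules.
GRAMMAR: Dict[str, List[Tuple[str, ...]]] = {
    "NP": [("the", "N")],
    "N": [("dog",), ("cat",)],
    "AUX": [("PAST",)],
    "V": [("chase",)],
}


@lru_cache(maxsize=None)
def derives(symbol: str, tokens: Tuple[str, ...]) -> bool:
    """True if `symbol` can dominate exactly `tokens` under GRAMMAR."""
    if symbol not in GRAMMAR:  # terminal morpheme
        return tokens == (symbol,)
    return any(matches(rhs, tokens) for rhs in GRAMMAR[symbol])


def matches(rhs: Tuple[str, ...], tokens: Tuple[str, ...]) -> bool:
    """True if the symbols of `rhs` jointly cover `tokens`, each non-empty."""
    if not rhs:
        return not tokens
    head, rest = rhs[0], rhs[1:]
    # Leave at least one token for every remaining right-hand-side symbol.
    for cut in range(1, len(tokens) - len(rest) + 1):
        if derives(head, tokens[:cut]) and matches(rest, tokens[cut:]):
            return True
    return False


def segmentations(index: Sequence[str],
                  tokens: Tuple[str, ...]) -> Iterator[List[Tuple[str, ...]]]:
    """Yield every split of `tokens` whose i-th segment fits the i-th term."""
    if not index:
        if not tokens:
            yield []
        return
    term, rest = index[0], tuple(index[1:])
    for cut in range(0, len(tokens) + 1):
        segment = tokens[:cut]
        # The variable X matches any segment, including the empty one.
        if term == "X" or (segment and derives(term, segment)):
            for tail in segmentations(rest, tokens[cut:]):
                yield [segment] + tail


INDEX = ("NP", "AUX", "BE", "EN", "V", "X", "X", "BY", "NP")
SENTENCE = ("the", "cat", "PAST", "BE", "EN", "chase", "BY", "the", "dog")
for split in segmentations(INDEX, SENTENCE):
    print(split)
```

A string for which `segmentations` yields nothing cannot have been produced by the transformation in question. A string for which it yields a result has only passed a necessary test, which is why the later stages of the procedure are still needed.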

Sufficient information is given in a transformation to permit the
computation of a function we will call the corresponding inverse transformation.
This function maps a sequence of trees satisfying the inverse structural index
of some transformation into a sequence of trees satisfying the structural index
of that transformation. More precisely, if a transformation t performed on a
tree T yields a tree T', we denote by P' the proper analysis of T' that satisfies
the inverse structural index of t. Now the inverse transformation t'
corresponding to t maps the proper analysis P' into a sequence of trees whose
debracketization is the terminal string of T. For the previously considered
inverse structural index (NP AUX BE EN V X X BY NP) and the inverse [...]



Let us now see how an analysis procedure can be based upon our
inverse transformations. We take up the analysis of a sentence S with respect
to a given context-free grammar G and an ordered set of transformations t_1,
..., t_n. To simplify our exposition we begin with a grammar containing no
binary transformations (i.e., transformations are not applied in a cyclic fashion).
Using one of the methods given by Petrick <10> we determine an
augmented context-free grammar G' that contains rules reflecting all structure
that can be produced in the derivation of S. In reversing the generative
procedure we first consider t_n. If t_n was performed in deriving S, it must
be possible to segment S such that with respect to G', the ith segment has an
analysis as a tree dominated by the ith term of the inverse structural index of
t_n. If such a segmentation is possible, and if t_n' is performed on the sequence
of trees provided by this segmentation, then the debracketization S' of the
resulting sequence of trees must be the terminal string of the tree that existed
just before application of t_n. (Complete debracketization turns out to be
unnecessary. Repeated debracketization of outermost structure until no
derived constituent structure remains is all that is required.) If the analysis
of S and S' (if it exists) are separately considered with respect to the original
[...] transformations t_1, ..., t_{n-1}, then the problem consists
of one or more instances of essentially the original problem of analyzing S
with respect to t_1, ..., t_n. If we carry out this procedure for each of the
[...]










(S, S', ...). Further reversing the generative procedure, it remains only to
determine which elements of this set are analyzable as the [...]
with respect to G. Every deep structure of S with respect to the given
transformational grammar must be included in the set of trees thus obtained. With
each tree it is also possible to associate the sequence of transformations used
in obtaining it.
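The bookkeeping of this reverse-order procedure, for a grammar without binary transformations, can be sketched as follows. The sketch works on flat morpheme strings rather than on sequences of trees, and the two toy inverse rules, the rule names, and the function `continuations` are all hypothetical stand-ins chosen for illustration. The final step of parsing each continuation with the base grammar G is left out.

```python
from typing import Callable, List, Set, Tuple

Tokens = Tuple[str, ...]
Inverse = Callable[[Tokens], List[Tokens]]


def undo_passive(tokens: Tokens) -> List[Tokens]:
    """Hypothetical inverse of a passive rule over one-word noun phrases."""
    if len(tokens) >= 7 and tokens[1:4] == ("PAST", "BE", "EN") and tokens[5] == "BY":
        obj, verb, subj = tokens[0], tokens[4], tokens[6]
        return [(subj, "PAST", verb, obj) + tokens[7:]]
    return []  # the inverse rule does not apply


def undo_fronting(tokens: Tokens) -> List[Tokens]:
    """Hypothetical inverse of a rule that fronts the adverb 'yesterday'."""
    if tokens and tokens[0] == "yesterday":
        return [tokens[1:] + ("yesterday",)]
    return []


def continuations(sentence: Tokens,
                  inverses: List[Tuple[str, Inverse]]) -> Set[Tuple[Tokens, Tuple[str, ...]]]:
    """Collect (string, transformations used) pairs, taking the rules in
    reverse generative order and treating every rule as optional."""
    found: Set[Tuple[Tokens, Tuple[str, ...]]] = {(sentence, ())}
    for name, inverse in reversed(inverses):
        new = set()
        for tokens, history in found:
            for result in inverse(tokens):
                # Prepending keeps the history in generative order.
                new.add((result, (name,) + history))
        found |= new  # keep both the undone and the untouched strings
    return found


# Rules listed in generative order: passive first, fronting second.
RULES = [("passive", undo_passive), ("fronting", undo_fronting)]
SENTENCE = ("yesterday", "mary", "PAST", "BE", "EN", "see", "BY", "john")
for tokens, history in sorted(continuations(SENTENCE, RULES)):
    print(history, " ".join(tokens))
```

Because every rule is treated as optional, the set keeps the input string itself alongside each string obtained by undoing one or more rules, and each member carries the list of transformations that would regenerate the input from it.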



The analysis procedure becomes more complicated when binary
transformations are included, as would be expected. Performance of an inverse
binary transformation must always insert two occurrences of the sentence
boundary symbol SENTB. The sequence of trees lying between these two
SENTB markers corresponds to the constituent sentence, and the sequence of
trees lying outside these two markers corresponds to the matrix sentence. Let
us call the debracketization of the former sequence the constituent sentence
continuation, and let us call the debracketization of the latter sequence, with
the symbol COMP inserted to divide the left and righthand sections, the matrix
sentence continuation.
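A minimal sketch of this split, assuming the result of an inverse binary transformation has already been debracketized into a flat tuple of morphemes. The function name and the example string are hypothetical, and the paper's procedure operates on sequences of trees rather than on flat strings.

```python
from typing import Tuple

Tokens = Tuple[str, ...]


def split_at_sentence_boundaries(tokens: Tokens) -> Tuple[Tokens, Tokens]:
    """Return (constituent continuation, matrix continuation) for a string
    containing exactly two SENTB markers."""
    positions = [i for i, token in enumerate(tokens) if token == "SENTB"]
    if len(positions) != 2:
        raise ValueError("expected exactly two SENTB markers")
    left, right = positions
    constituent = tokens[left + 1:right]
    # COMP marks the place in the matrix string where the embedded
    # sentence was removed.
    matrix = tokens[:left] + ("COMP",) + tokens[right + 1:]
    return constituent, matrix


EXAMPLE = ("the", "man", "SENTB", "the", "man", "PAST", "leave",
           "SENTB", "PAST", "smile")
constituent, matrix = split_at_sentence_boundaries(EXAMPLE)
print("constituent:", " ".join(constituent))
print("matrix:     ", " ".join(matrix))
```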

It is clear that the constituent sentence continuation could arise from
repeated application of the transformational cycle. Hence, the problem of
determining the underlying deep structure of this derived string is another
instance of the original problem. In other words, inverse transformations
must be applied to the constituent sentence continuation in reverse generative
order, as we already have discussed. If, eventually, no binary transformation
applies on some inverse cycle, the recursion terminates; the structure thus
found is of course dominated by the sentence symbol SI, and in the complete
structural description of the given sentence this SI-dominated tree is [...]



The generative transformational cycle works in such a way that no
singulary transformations apply to the matrix sentence structure before a
binary transformation has been applied to it and the constituent structure it
dominates. Hence, the matrix sentence continuation resulting from an inverse
binary transformation need not be subjected to the entire inverse
transformational cycle. More than one embedded constituent sentence structure can be
dominated by a single matrix sentence structure, however (as, for example,
when both subject and object contain relative clauses), so the matrix sentence
continuation must be subjected to repeated applications of the same or other
binary transformations. The resulting matrix sentence continuation must
finally be analyzed with respect to the base component G to see if an analysis
as an SI is possible. Every underlying structure assigned by the given
grammar to S must be included in the set of structures thus obtained.

A brief reflection on why we begin the analysis of a sentence by applying
inverse singulary transformations is in order. Although it is true that
singulary transformations precede binary transformations in a given cycle, when the
last binary transformation has been performed for the last time it is still
possible for the singulary rules to apply to the result of this final embedding.
Once the last singulary has been applied, however, generation is complete.
[...] corresponding inverse is to be applied in recognition.

As we have already observed, [...] have described [...]




to a sentence. It is possible, however, that one or more spurious structures
will also be found. There are several sources of slack in our procedure that
could cause such a situation to occur. One of these is related to our use of
so-called "auxiliary" phrase structure rules, which reflect structure that
can be transformationally derived. These rules are required in order to
ensure the application of every inverse transformation necessary to reverse
the generative derivation. The use of these rules, however, raises the
possibility that an inverse transformation will be applied at some point where it
should not apply, from the point of view of reversing a valid generative derivation.
If the continuation resulting from this wrong application of an inverse
transformation is not subsequently blocked, an invalid underlying structure may
result.

Three other sources of incorrect "structural descriptions" are possible.
All can be eliminated by suitable modification of the basic procedure we have
presented. The first deals with the use of obligatory transformations. The
procedure we have described finds all underlying structural descriptions of
a sentence with respect to a grammar in which all the transformations are taken
to be optional. It also yields all correct structural descriptions for a grammar
in which only some of the transformations are obligatory, but it may in addition
give erroneous structures.

The second source of unwanted structures is related to supplementary
conditions that may be imposed on the applicability of a transformation. For
[...]



The third source of error is related to trees that are reduplicated by
a transformation. In recognition it is of course necessary to ensure that two
trees are identical. It would be easy enough to mark such transformations
and make the necessary tests for equality, thus eliminating this source of
incorrect structures.

Incorrect structures arising from any source may be discarded during
the final synthesis phase. In this phase, structures produced by the analysis
process are viewed as instructions to produce sentences. The base tree and
the list of transformations constitute commands defining the appropriate phrase
structure and transformational rules to apply to generate a sentence. The
resultant string is compared with the original input string. The structural
descriptions that yield strings matching the input string are those that constitute
structural descriptions of the input string. All other structural descriptions,
which yield either nonmatching strings or no strings at all, are discarded. In
practice, bogus recognitions are rare. In theory, their possible occurrence
renders synthesis a necessary part of an effective recognition procedure of
the type we have sketched.
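The synthesis check can be sketched as a filter over candidate analyses. As before, this is a toy illustration over flat morpheme strings: the forward rule `front_adverb`, the rule table `FORWARD`, and the helper names are hypothetical and stand in for the grammar's own phrase structure and transformational rules.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Tokens = Tuple[str, ...]
Forward = Callable[[Tokens], Optional[Tokens]]
Candidate = Tuple[Tokens, Tuple[str, ...]]


def front_adverb(tokens: Tokens) -> Optional[Tokens]:
    """Hypothetical forward rule: move a final 'yesterday' to the front."""
    if tokens and tokens[-1] == "yesterday":
        return ("yesterday",) + tokens[:-1]
    return None  # the rule is not applicable to this string


FORWARD: Dict[str, Forward] = {"fronting": front_adverb}


def regenerate(base: Tokens, history: Sequence[str]) -> Optional[Tokens]:
    """Apply the recorded transformations in generative order; return None
    if some transformation turns out not to be applicable."""
    current: Optional[Tokens] = base
    for name in history:
        current = FORWARD[name](current)
        if current is None:
            return None
    return current


def confirmed(sentence: Tokens, candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates whose regenerated string matches the input."""
    return [(base, history) for base, history in candidates
            if regenerate(base, history) == sentence]


SENTENCE = ("yesterday", "john", "PAST", "sleep")
CANDIDATES = [
    (("john", "PAST", "sleep", "yesterday"), ("fronting",)),  # regenerates input
    (("john", "PAST", "sleep"), ("fronting",)),               # yields no string
    (("john", "PAST", "sleep"), ()),                          # nonmatching string
]
print(confirmed(SENTENCE, CANDIDATES))
```

The second and third candidates correspond to the two failure cases named in the text: a derivation that yields no string at all, and one that yields a string different from the input.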

* - >• - - - * - ' - 



A thorough understanding of this analysis algorithm requires more
precise definitions and some concrete examples. The reader is referred to
references <10>, <12>, and [...]







applicability, (3) statable additional conditions of transformational
applicability, (4) Chomsky adjunction, (5) the use of coordination-reduction
rule schemata, (6) precyclic and postcyclic transformational components,
and (7) obligatory as well as optional transformations.



Considering these extensions individually, the addition of complex
symbols presents several problems. First, I have not faced the problem of
lexical selection and its inverse, so I have assumed that input sentences consist
of strings of feature bundles. Second, I have restricted features to lexical
items in the manner of Aspects <11>. The more general case of allowing
features to be associated with nonterminal nodes has been considered, but
there remain unsolved problems of deriving the features to be associated with
the nonterminals of transformationally derived structure. Finally, certain
feature-sensitive rules can give rise to nondeterministic inverse transformations.
For example, if a transformation of the type [+A] -> [-B] is used, it is
indeterminate whether the reverse transformation should leave -B as it is, change it to
+B, or delete it entirely. Separate continuations resulting from all three
possibilities must be followed, and a rule of the type
[rule display not recoverable; surviving fragments: +A2 ... +A3] [...]













change to the analysis algorithm. In the old algorithm, as we sketched it,
intermediate derivational stages were not completely known; only proper
analyses of intermediate trees were required and found. As an alternative
to directly determining proper analyses that satisfy an inverse structural
index it is possible to parse a continuation string resulting from the application
of an inverse transformation up to the sentence symbol, using the augmented
CF grammar; the resulting structures can be examined to ensure that they
satisfy the labelled bracketing structural condition and any other conditions
of the forward transformation in question; and that forward transformation can
actually be applied to ensure the validity of the inverse transformational step.
The mechanics of such a step are illustrated by the diagram which appears at
the end of the paper.

It must be noted that even though a general labelled bracketing
structural condition is allowed, a structural index must still be identified by means
of that labelled bracketing, and a structural change is specified only through
the use of that structural index, as before.

Chomsky adjunction presents no particular problem because an inverse
structural index and inverse structural change may still be mechanically
computed.






The enrichment of a transformational grammar to include the use of
coordination-reduction rule schemata is discussed in reference <14>.
For[...] problems. They do, however, offer enormous opportunities for the
proliferation of spurious continuations. This, in turn, will probably require even
more careful modification and tuning of a grammar, to keep the analysis time
within acceptable bounds.

Finally, the addition of obligatory transformations was fully discussed
in reference <10> but was not programmed at that time. A presently existing
transformational analysis program incorporates those considerations. This
program has not yet been extensively tested and hence must be considered to
be in a "debugging" state. For this reason it has not yet been described in
the literature. Because it has not yet been tested on any sizeable grammar
it is not possible to estimate running times. It is safe to say that the analysis
of sentences with respect to a good-sized transformational grammar currently
under development at the IBM T. J. Watson Research Center will undoubtedly
require careful analysis-dictated modification of that grammar. In addition,
the analysis procedure itself may well have to be significantly modified. The
principal hope is that by actually performing forward transformational tests
on the fly, spurious continuations can be avoided before they exponentially
proliferate.

It is clear that at this time it is possible to produce transformational
grammars (perhaps not uniformly well-motivated linguistically) which exceed
the computational capabilities of the existing program. This, of course, limits
[...] grammars can be modified so as to permit syntactic analysis in a reasonable
time for experimental purposes. For the purpose of such applications as
question answering systems, information retrieval systems and natural
language programming systems this may be less of a problem than the problem
of writing a grammar that specifies a sufficiently large subset of English.




[Diagram: mechanics of an inverse transformational step. Recoverable box
labels, in order of appearance: Set of Surface Structure Trees; Proper
Analysis; Set of Inverse Proper Analyses; Forward Transformation Check;
Inverse Transformation; Sub-trees from Inverse Transformation; Derived
Constituent Structure Removal; Sub-trees with Derived Constituent Structure
Removed; CF Parsing; Set of Surface [...]]






References 



<1> Zwicky, A., Friedman, J., Hall, B., and Walker, D. The MITRE
syntactic analysis procedure for transformational grammars, Proc.
Fall Joint Computer Conference, 1965, Spartan Books, Washington,
D.C., pp. 317-326.

<2> Kay, M. Experiments with a powerful parser, Proc. Deuxième
Conférence Internationale sur le Traitement Automatique des Langues,
Grenoble, Aug. 1967, Paper No. 10.

<3> Simmons, R. F., Burger, J. F., and Long, R. E. An approach
toward answering English questions from text, Proc. 1966 Fall Joint
Computer Conference, 1966, pp. 357-363.

<4> Moyne, J. A., Loveman, D. B., and Tobey, R. G. Cue: A preprocessor
system for restricted natural English, Proc. Symp. on Information [...]

[...]

<8> Winograd, T. Procedures as a representation for data in a computer
program for understanding natural language, Rept. AI-TR1, Artificial
Intelligence Laboratory, MIT, 1971.

<9> Kellogg, C., Burger, J., Diller, T., and Fogt, K. The CONVERSE
natural language data management system: Current status and plans,
Proc. Symp. on Information Storage and Retrieval, Univ. of Maryland,
April 1971, pp. 33-46.

<10> Petrick, S. R. A recognition procedure for transformational grammars,
Ph.D. thesis, MIT, 1965.

<11> Chomsky, N. Aspects of the theory of syntax, The M.I.T. Press,
Cambridge, Mass., 1965.

<12> Petrick, S. R. A program for transformational syntactic analysis,
Air Force Camb. Res. Labs. Research Paper No. 278 [...]

<13> Keyser, S. J. and Petrick, S. R. Syntactic analysis, Air Force Camb.
Res. Labs. Research Paper No. 324, AFCRL-67-0305, May 1967.

<14> Petrick, S. R., Postal, P. M., and Rosenbaum, P. S. On coordination
[...]





Syntactic Analysis Requirements 
of Machine Translation 



by 



S. R. Petrick 

IBM T.J. Watson Research Center 






In this note I will confine my attention to machine translation
systems which are based upon an underlying formal generative grammar.
This is not to deny the potential importance of various computational aids
to human translation, nor to deny the possibility of machine translation
not based on a formal grammar. It is clear, however, that for fully
automated MT any attempt to make use of presently existing linguistic
theory or of that which is likely to exist in the foreseeable future requires
a grammar-based approach.



A second assumption I wish to make is the existence of two distinct
components of a grammar: a syntactic component and a semantic
component. The former assigns structure to sentences and the latter interprets
those structures by translating them to a natural language (in the case of
MT) or to an artificial language which has its own [...]




The importance of the syntactic component has been recognized for
some time. For the purposes of MT it has two distinct ends to achieve:
on the one hand it must specify a large enough subset of the source
language to meet the operational requirements of the MT application in
question. (The related function of ruling out syntactically ill-formed
sentences is of limited importance in MT.) On the other hand the
structures it assigns must provide a reasonable basis for semantic
interpretation. These two requirements are closely related, i.e., it is
relatively easy to satisfy one at the expense of the other, but much harder
to adequately meet them both.



A not uncommon attitude which has been expressed both in the
computational linguistic literature and orally at symposia and conferences
is that syntax in general and syntactic analysis in particular has been well
worked over, is thoroughly understood, and presents no serious problems,
in contrast to the situation in semantics where little has been done and not
much is understood. I submit that such remarks reflect the experience of
one who has chosen a class of grammars, in most cases context-free
grammars, which permits a reasonable coverage of a source language at the
expense of assigning structural descriptions which bear little relationship
to underlying meaning and which, therefore, provide an inadequate basis
for semantic interpretation. It is not just because large-coverage
context-free grammars have been found to often assign 100 or more structural
descriptions to unambiguous sentences that makes them inadequate. Rather,
[...]




This shortcoming is not limited to the class of context-free grammars.
If the rewriting system [...] and/or rewriting rules with [...] whose
constituents [...] grammars can be [...] generalizations are realized, but
[...] to meaning appears intractable, [...]








semantic interpretation deep structures which were in many cases far 
removed from surface structures. Chomsky made use of a transformational 
component to relate corresponding deep and surface structures, but the 
acceptance of the deep-surface structure distinction is a matter which is 
independent of any consideration of the most appropriate means for making 
explicit that correspondence. Accordingly, a host of models (each of which 
is a proposed linguistic theory even if not called such) have been proposed 
for mapping surface structures into corresponding deep structures, or (in 
some cases) for directly assigning deep structure to sentences without 
explicitly producing surface structure. 



It is my contention that linguistic models which do not provide the 
deep structure of sentences (at least implicitly if not explicitly) fail to 
provide a basis for the semantic analysis of all but a small class of 
sentences, a class so restricted that its use is precluded for most 
applications including MT. Hence, for the remainder of my discussion I 
will focus my attention on the problems of syntactic and semantic analysis 
associated with some type of deep structure model. 
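
The earlier remark that large-coverage context-free grammars can assign
100 or more structural descriptions to a single sentence can be illustrated
with a small Python sketch. The grammar below is a hypothetical toy
(prepositional-phrase attachment only) and is not taken from any system
discussed here; the function name count_parses is likewise an assumption.
The sketch counts parse trees with a CYK-style chart rather than
enumerating them.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (an assumption for
# illustration): NP -> NP PP, PP -> P NP, NP -> 'n', P -> 'p'.
BINARY_RULES = [("NP", "NP", "PP"), ("PP", "P", "NP")]
LEXICAL_RULES = {"n": ["NP"], "p": ["P"]}


def count_parses(tokens, start="NP"):
    """Return the number of distinct parse trees deriving tokens from start."""
    n = len(tokens)
    # chart[i][j][X] = number of trees deriving tokens[i:j] from category X
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, token in enumerate(tokens):
        for category in LEXICAL_RULES.get(token, []):
            chart[i][i + 1][category] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    chart[i][j][parent] += chart[i][k][left] * chart[k][j][right]
    return chart[0][n][start]


if __name__ == "__main__":
    # A noun followed by six prepositional phrases.
    sentence = ["n"] + ["p", "n"] * 6
    print(count_parses(sentence))
```

Under this toy grammar the count grows with the Catalan numbers, so a
noun followed by six prepositional phrases already exceeds one hundred
analyses, none of which is distinguished by the grammar itself.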



As pointed out previously, there is a trade-off possible between syntax
and semantics. If more is done by the syntactic component the task of the
semantic component is lightened and vice versa. Contemporary linguistic
theory has been much concerned with this question of where to draw the
line, and even though the questions of overall simplicity considered have
not been [...] any concern for [...] consider the applicability to MT of
models of present-day deep structure [...]
semantic analysis and/or translation requires not only a number of deep
structure distinctions but also a large amount of information about the
world, about logical deduction, and about the context of discourse in
which the sentence appears. I am resigned to the prospect that these
obstacles preclude for the foreseeable future extremely high quality
translation.



My own experience with semantic interpretation has been with
translation to a formal language which, although not a programming
language in the sense of having an existing hardware or software
interpreter, is close enough to a programming language that the task of
translating it to an existing programming language is an easy one. The
problem of translating a given structure to a functional programming
language appears to me to be greater than that of translating that structure
to another natural language. This follows from two considerations: First,
deep structures of different languages which have been proposed to date
are remarkably similar. In those cases where differences have been argued,
they have seldom exceeded differences in subject, verb, object ordering.
Deep structures differing so slightly are easily related through the use of
such standard translation mechanisms as the Irons Translator.






Second, the task of using a transformational grammar to convert
deep structures into surface structures is not conceptually difficult.
Hence, it would appear that for a very large class of sentences the
translation sequence shown below should provide the basis for translation:

[Figure: translation sequence; not recoverable]

[...] language question answering system. This has also been argued [...]
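
The two considerations above suggest a three-stage sequence: analysis of
the source sentence into a deep structure, a mapping between the deep
structures of the two languages, and generation of a target surface
string. The following Python sketch shows only the shape of such a
sequence. Every name in it (Deep, analyze, transfer, generate, the
three-word lexicon) is a hypothetical stand-in, and the "analysis" step
handles only sentences of the form SUBJECT VERB OBJECT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deep:
    """Toy deep structure: a predicate with its subject and object."""
    verb: str
    subject: str
    obj: str


# Hypothetical German-to-English lexicon, for illustration only.
LEXICON = {"sieht": "sees", "Hans": "Hans", "Maria": "Maria"}


def analyze(sentence):
    """Source surface string -> source deep structure (toy: SUBJ VERB OBJ)."""
    subject, verb, obj = sentence.rstrip(".").split()
    return Deep(verb=verb, subject=subject, obj=obj)


def transfer(deep):
    """Source deep structure -> target deep structure by lexical mapping."""
    return Deep(LEXICON[deep.verb], LEXICON[deep.subject], LEXICON[deep.obj])


def generate(deep, order=("subject", "verb", "obj")):
    """Target deep structure -> target surface string in the given order."""
    return " ".join(getattr(deep, role) for role in order) + "."


def translate(sentence):
    return generate(transfer(analyze(sentence)))


if __name__ == "__main__":
    print(translate("Hans sieht Maria."))
```

The order parameter of generate stands in for the observation that
proposed deep structures of different languages have seldom differed by
more than subject, verb, object ordering.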










I have argued that the use of a deep structure (read semantic
structure if you prefer) generative grammar does provide a reasonable
basis for MT. It does so, however, by throwing a considerable burden
on the syntactic component. We have seen that structures can be
assigned which appear adequate for the purposes of MT. But what of
the coverage requirement, i.e., that a sufficiently large subset of the
source language be specified? In addition, we must concern ourselves
with the theoretical and practical requirements of syntactic analysis for
a class of grammars that is capable of assigning adequate deep structures.



I will discuss these two considerations with respect to generative
transformational theory and also, more briefly, with respect to other
deep structure-based linguistic theories.

Let us first consider the matter of coverage. It is, of
course, the case that most transformational studies of syntax do not
supply completely specified base and transformational component rules
in discussing syntactic phenomena. There have been, however, a few
attempts to write a completely specified set of rules within a well-defined
transformational framework (5, 6, 7, 8). These efforts establish a lower
bound on coverage which can be achieved without sacrificing structural
adequacy. It is somewhat difficult to characterize the coverage achieved
[...]










for handling verb phrase complements, pronominalization, preposition
segmentalization and raising, indirect objects, relatives, genitives,
negatives, certain time and place adverbials, etc. Similarly, in a more
recent effort at the IBM Thomas J. Watson Research Laboratory a
transformational grammar has been produced which generates such sentences
as:

     what companies had a profit which was more than
     ten million dollars?

and

     print the one element of the set which contains M which is atomic

and provides such construction types as: yes-no and WH-questions,
passives, prepositional phrases, nominal structures formed from
underlying abstract verbs, restrictive relatives, possessive genitives, and
certain types of negatives, comparatives, and coordinate structures.



Now just as existing grammars establish a lower bound on coverage
attainable there are several considerations which suggest upper bounds
for at least the foreseeable future. For example, many syntactic
phenomena may be identified which have not yet been studied by anyone.
Many other phenomena have been studied, but the results have served
more to show the existence of substantial problems than to offer compelling
and widely accepted solutions. Examples here are plentiful and include
coordination, gapping, and pronominalization as well as almost every
syntactic phenomenon which has been studied to some extent. And finally,
experimental work conducted to date shows that it is far from trivial to
put together and test grammars that provide for such relatively well
understood constructions as yes-no questions, WH-questions, restrictive
[...]













and representative for some application and comparing their syntactic
requisites with the facilities offered by any existing or proposed grammar.
I have seen this operation carried out at the MITRE Corporation with
respect to a command and control question answering application and have
myself undertaken the same task for a formatted file question answering
facility. The results were the same. Very low coverage was observed;
certainly less than 10% of the sentences studied were covered even
allowing for lexical addition and extension by including some rather
obvious additional transformations. The saving feature in the case of
natural language question answering systems or natural language
programming systems, however, is that they need not process unconstrained
input sentences. Instead the user can be constrained to and instructed as
to how to limit his input in terms of both lexicon and allowable constructions.
All that is required is that natural subsets provided must be learnable by
human speakers and must be rich enough to permit expressing that which
must be expressed in a convenient fashion. The attainability of even
these requirements remains to be established but at least offers some
hope of success. On the other hand the usual situation with MT is that the
input is not produced with the limitations of a particular formal grammar
in mind. This, more than any other single factor, convinces me that
grammar-based MT offers little hope for practical usage for at least the
next ten years. This is not to say that MT is not an interesting and
productive vehicle for keeping linguistic research in both syntax and
semantics tied to reality. Others might disagree with this assessment, of
course.
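
The coverage comparison described above (collect representative
sentences, then ask how many of them a given grammar and lexicon
provide for) can be sketched in a few lines of Python. The lexicon, the
two construction patterns, and the sample sentences are all hypothetical;
a real test would run an actual analysis program in place of the regular
expressions used here.

```python
import re

# Hypothetical restricted lexicon and construction inventory (assumptions).
LEXICON = {
    "what", "which", "companies", "company", "had", "has",
    "a", "the", "profit", "sales", "list", "print",
}
CONSTRUCTIONS = [
    re.compile(r"^(what|which) \w+ (had|has) (a|the) \w+$"),  # WH-question
    re.compile(r"^(list|print) the \w+$"),                    # imperative
]


def is_covered(sentence):
    """True if every word is in the lexicon and some construction matches."""
    words = sentence.lower().rstrip("?.").split()
    if any(word not in LEXICON for word in words):
        return False
    text = " ".join(words)
    return any(pattern.match(text) for pattern in CONSTRUCTIONS)


def coverage(corpus):
    """Fraction of corpus sentences that the restricted subset covers."""
    if not corpus:
        return 0.0
    return sum(is_covered(sentence) for sentence in corpus) / len(corpus)


if __name__ == "__main__":
    sample = [
        "What companies had a profit?",
        "Print the sales.",
        "Which company has the profit",
        "Could you tell me roughly how the firms did?",
    ]
    print(coverage(sample))
```

The same check, applied at input time, is one way a question answering
system could hold its users to a learnable subset; unconstrained MT input
offers no such opportunity.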









There may be a few MT applications where time and economic
considerations permit the phrasing or rephrasing of source sentences by
[...]




Thorne (12), Moyne (13), Kellogg (14), Kay (15), and Simmons (16), we are
faced with a difficult task for a number of reasons. Many of these models
have been used only sparingly for the specification of any natural language.
Hence, there is little to go on in assessing the coverage of these models. In
addition, those models for which one or more large grammars have been
written have not been documented in a way and to an extent which makes
the determination of coverage feasible. Alternative clarification of
coverage via sample sentences and listed construction types presents the
same problem as we observed for transformational grammars, but
whereas most linguists are by this time familiar with transformational
formalism, this is not true of the aforementioned analysis-based models.
Therefore, their coverage can at present be estimated only by their
originators. It is far from clear to this observer that these approaches
offer the same independence of construction types as is achieved by
transformational theory. In any case, none of these models has supported
claims of greater coverage than that afforded by current transformational
theory. It is important to note that although these models are often
described as "transformational" by their originators, they have not been
related to transformational theory and hence must be judged on the usual
grounds of linguistic adequacy just like any other proposed linguistic
theory.



The remaining consideration is the theoretical and practical
requirements of syntactic analysis for a deep structure-specifying class of
grammars. For those analysis-based grammars previously mentioned [...]







The situation is quite different with respect to transformational
grammars. There is no shortage of work in linguistic description through
the use of transformational grammars, although it must be noted that
most efforts are directed toward determining the allowable class of
transformational grammars rather than toward developing in detail any one
comprehensive grammar. Syntactic analysis for any class of
transformational grammars is a very complex and time-consuming proposition.
It is probably for this reason that most workers in computational
linguistics have chosen to forego conventional transformational theory in
favor of an analysis-based alternative.



There have been only two computer implemented efforts on
transformational grammar syntactic analysis. One, carried out by the MITRE
Corporation, was limited to a particular grammar; a syntactic analysis
program was tailored to this grammar. The program appeared to be
successful in producing desired structures in a reasonable time, but it
was never established that this program invariably found all of the
structures assigned to a sentence by the particular transformational
grammar in question (i.e., that it was, in fact, an analysis program
for that grammar).



In contrast to the MITRE approach, Petrick (17) defined a class of
transformational grammars and found a syntactic analysis algorithm
that is valid for members of this class. The extremely nondeterministic
nature of this algorithm made unfeasible the treatment of grammars as
written by a linguist unfamiliar with the analysis procedure. However,
Kirk and Keyser (6) showed that by suitable recasting, a substantial portion
of an existing grammar (due to Rosenbaum) could be used for syntactic
analysis [...]
serious difficulty in transformational grammar syntactic analysis. The
class of grammars for which syntactic analysis algorithms have been [...]










static, and at any given time there is little agreement on just what should
constitute an allowable class of transformational grammars. In reference 18
we give an account of the current status of syntactic analysis for
transformational grammars. In summary, it can be stated that although the
class of grammars for which syntactic analysis is possible has been
significantly extended, the introduction of new variants of transformational
theory has more than kept pace with theoretical and programming efforts to
cope with them. Consequently, any given linguist would undoubtedly find
that his rules and assumptions do not correspond perfectly with the
formulation of the allowable class of grammars. Nevertheless, it is hoped that
this class is now extensive enough to permit recasting of current
transformational grammars into an acceptable form without seriously
com[...]













REFERENCES

1.  Irons, E. T. A syntax directed compiler for ALGOL 60,
    Comm. ACM 4 (Jan. 1961), pp. 51 - 55.

2.  Knuth, D. E. Semantics of context-free languages,
    Math. Sys. Theory 2 (1968), pp. 127 - 145.

3.  Petrick, S. R. On the use of syntax-based translators for
    symbolic and algebraic manipulation, Proc. Second Symp. on
    Symbolic and Algebraic Manipulation, Los Angeles, Calif.,
    March 1971, pp. 224 - 237. (Also IBM RC326S)

4.  Zwicky, A., Friedman, J., Hall, B., and Walker, D. The
    MITRE syntactic analysis procedure for transformational grammars,
    Proc. Fall Joint Computer Conference, 1965, Spartan Books,
    Washington, D. C., pp. 317 - 326.

5.  Rosenbaum, P. S. and Lochak, D. The IBM core grammar of
    English, Specification and Utilization of a Transformational
    Grammar, Scientific Report No. 1 (IBM Corp., Yorktown
    Heights, N. Y., 1966).

6.  Keyser, S. J. and Kirk, R. Machine recognition of
    transformational grammars of English, Air Force Cambridge Res.
    Labs. final report No. 67-0316, Jan. 1967.

7.  Rosenbaum, P. S. IBM English grammar II. Specification and
    [...]

8.  Stockwell, R. P., Schachter, P. and Partee, B. H. Integration
    of transformational theories of English syntax, USAF Electronic
    [...]

9.  Woods, W. A. Transition network grammars for natural
    language analysis, Comm. ACM 13 (Oct. 1970), pp. 591 - 606.

10. Winograd, T. Procedures as a representation for data in a
    computer program for understanding natural language, Rept. AI-TRI,
    Artificial Intelligence Laboratory, MIT, 1971.

11. Bobrow, D. G. and Fraser, J. B. An augmented state transition
    network analysis procedure, Proc. Internat. Joint Conf. on
    Artificial Intelligence, Washington, D. C., 1969, pp. 557 - 567.

12. Thorne, J., Bratley, P., and Dewar, H. The syntactic analysis
    of English by machine, Machine Intelligence 3, D. Michie (Ed.),
    American Elsevier, New York, 1968.

13. Moyne, J. A., Loveman, D. B. and Tobey, R. G. Cue: A
    preprocessor system for restricted, natural English, Proc. Symp.
    on Information Storage and Retrieval, Univ. of Maryland, April 1971,
    pp. 47 - 60.

14. Kellogg, C., Burger, J., Diller, T., and Fogt, K. The Converse
    natural language data management system: Current status and plans,
    Proc. Symp. on Information Storage and Retrieval, Univ. of Maryland,
    April 1971, pp. 33 - 46.

15. Kay, M. Experiments with a powerful parser, Proc. Deuxieme
    [...] Grenoble, Aug. 1967, Paper No. 10.

16. Simmons, R. F., Burger, J. F., and Long, R. E. An approach
    [...] from text, Proc. 1966 Fall Joint Computer Conf., 1966,
    pp. 357 - 363.

17. Petrick, S. R. A recognition procedure for transformational grammars,
    Ph. D. thesis, MIT, 1965.

18. Petrick, S. R. Syntactic analysis for transformational grammars,
    Proc. of the Conference on Linguistics, The University of Iowa,
    Iowa City, Iowa, Oct. 1970.

















APPENDIX 



Analysis of Es liegt eine grosse Anzahl von Elementen vor.



by Annette Stachowitz 



Linguistics Research Center 
The University of Texas at Austin 



[Figure: structural tree of the sentence]



Rules

01    V EXPLET          =  * ES

02    V V               =  * LIEG
      + CL(27)
      + PX(..'25'..)/
      + GC(..'X'..)/
      + TO(..'X'..)/
      + TS(..'AB'..)

02    V V               =  * LAG
      + CL(9)
      + PX(..'25'..)/
      + GC(..'X'..)/
      + TO(..'X'..)/
      + TS(..'AB'..)

      (analogous 02 rules for the stems laeg, leg)

03    V END             =  * T
      + TY(T)

04    V DET             =  * EINE
      + GD(F)/
      + CA(N,A)/
      + NU(S)
      + IN(S)

05    V A               =  * GROSS
      + CL(7)

06    V END             =  * E
      + TY(E)

07    V N               =  * ANZAHL
      + CL(10)
      + GD(F)
      + TY(QU)

08    V PREP            =  * VON
      + PR(24)
      + GC(D)

09    V N               =  * ELEMENT
      + CL(11)
      + GD(N)
      + TY(AB+CN)

010   V END             =  * EN
      + TY(EN)

011   V PRFX            =  * VOR
      + PX(25)

012   V PRD             =  * .

013   V PRED            =  V V            V END
      + PS(3'2)/           $ CL(..,27)    $ TY(T)
      + NU(S'P)                           B
      + TN(PR)
      + VC(A)
      + MD(I)
      A 2

014   V ADJ             =  V A            V END
      + GD(M'F,N)/         $ CL(..,7)     $ TY(E)
      + CA(N'N,A)/                        B
      + NU(S)
      + IN(W)

015   V NO              =  V N
      + CA(N,G,D,A)/       $ CL(..,10)
      + NU(S)
      A 2

016   V NO              =  V N            V END
      + CA(D)              $ CL(..,11)    $ TY(EN)
      + NU(P)                             B
      A 2

017   V NP              =  V NO
      + PS(3)              $ NU(P)
      $ 2.1NU
      A 2




018   V PRPH            =  V PREP         V NP


;•%; ;.; y V yV J= 


A 2,3 : 


. 3 . 1 GC 


:vB 


CA 


■ + " •' ; : : : = 




•* \ • 


gp| 


GD 


i;* : - ,v i '• / '} v.: 


019   V NP              =  V DET          V ADJ          V NO


+ PS (3 ) 






4 . 1 GD/ 


$ CGo?f r : ^-; 


$*2MGD/s»,s-^£ 




:V"i ;N.V- j- 


4.2NU/ 


$ NU/ 


$*2 , 5NU/ 


$ CA 


" ‘ V 


4.3CA 


$ ca 


$*2 . 6CA 


. 3.1 , W 2 . 1 / 


ill$l 


IN 


• •: •> ■ •• ; \ 


a 4 


. 3 . 2 , W 2 . 2 / 


..r ,>*;• 


; . "VT TVVV 


: : - ^ ^ . . ' , ■. ■ ■ - 


■ ' ' ;V ': 


. 3 . 3 , W2 . 3 




••• •;. • • -V,-; 


.T v TT " : : ' T' v ' '• r ‘ -v. ./; r.-r‘ . 




. *3 . 4 IN 




' V-.';' ' 


v-T‘T "" ‘ ■ • • :<■ ‘ -:l y ■■ T.-/.-, •> 


■ \ . T ' = / ; . . . : - V ' :V ' ' 


•• •’ • : •. •• • '• ■ : • 







020   V NP              =  V NP           V PRPH
      A 2,3                $ TY(QU)       $ PR(24)
                                          $ NU(P)

022   V CLS             =  V PRED         V NP


+ 


TY(INV) 


$ 


PX 


• 


2 . 2 P S / 


$ 


3.4 


$ 


PS/ 


m 


2.3NU 


$ 


2.5 


! 


NU 


a 


2.4TY 






$ 


TS 


? 


PRN 






| 


TN 

VC 


$ 


CA ( N ) 






$ 


MD 







V PRFX 
. 2, IPX 



The subscript PRN in the NP constituent is added to
the clause label only if NP dominates a pronoun:

      V NP              =  V PRN
      + PRN                $ TY(PS)
      A 2

023   V CLS             =  V EXPLET       V CLS
      A 3                                 $ TY(INV)
                                          * PRN

(This rule specifies that a clause with inverted word
order may only be preceded by an expletive es if its
subject is not a personal pronoun: Es kommen drei
Personen in Frage. But: *Es kommen sie in Frage.)

024   V SNT             =  V CLS          V PRD
      $ 2.1                $ TY(DC)       B



This analysis may show the difficulties that have to be
accounted for in the analysis of surface strings with context-
free phrase structure rules. Apart from the problems of [...]
of elements in the surface structure and of
phrasal dictionary elements, the amount of information in [...]
restrictions or feature packets it is associated with.
(Feature packets may include separable prefixes, case
government including prepositional objects governed,
types of objects and subjects required, etc.) For
example, the German verb liegen may be associated with
30 different feature packets, resulting in 30 different
readings of which a few are shown here (these translations,
with a few exceptions, are taken from Wildhagen and
Heraucourt, German-English / English-German Dictionary,
Vol. II German-English, Brandstetter Verlag, Wiesbaden,
1957):

1. liegen, intransitive, requiring a physical object as
   subject, with a locative adverb: to lie, to rest, to
   be located or situated;

2. liegen, governing a dative object which must be human
   and with a subject which must be abstract: to suit [...]



because of the following two characteristics:
a) Rule constituents are only subconfigurations of work
space configurations, i.e. only the features relevant in
a particular rule are mentioned in that rule while all
others are disregarded. For example, rule 013 (p. 3)
only states the condition that a verb stem must be
classified as belonging to the paradigmatic class 27 in
order to be concatenable with the verb ending -t, thus
forming a predicate with the indicated features. The
remaining properties of the verb (prefix, case government,
type of object and subject required) are irrelevant in this
concatenation rule and are merely "carried up the structural
tree" by means of the operation specified by the symbol A 2
on the left side of that rule.



b) Agreement and government are specified as set theoretical
operations between the values of rule constituents. For
example, rule 019 (p. 3) very generally states that in a
German sentence the sequence determiner-adjective-nominal
should be analyzed as a noun phrase provided that they
agree in gender, number and case, and that the adjective
and the determiner must not agree in type of inflection
(weak or strong). These conditions are expressed by the
operations specified in the second and following lines of
each constituent of this rule. (All other features of the
nominal head are not specifically mentioned in the rule
and are simply carried up the tree.) Thus, very large
numbers of rules can be represented by one rule in this
subscript format. This makes it possible to incorporate and
refer to the large amount of information necessary for
analysis and translation in the dictionary and syntax of a
surface grammar. Access to this information available in the
surface string would be practically impossible with a context-
free phrase structure grammar with simple symbols because of the
unmanageable number of lexical classes and morphological














and syntactic rules building on these classes. 
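
The two characteristics can be made concrete with a short Python
sketch. It is an illustration under stated assumptions and not the LRC
implementation: constituents are plain dictionaries, the function names
are hypothetical, and the paired gender/case values of the adjective
(the apostrophe and slash notation) are flattened into independent sets,
which is a simplification. The first function mirrors characteristic a):
the rule tests only class and ending type, drops the paradigmatic class,
and carries every other verb feature up unchanged. The second mirrors
characteristic b): agreement is computed as set intersection, and the
inflection types of determiner and adjective must not overlap.

```python
def matches(constituent, condition):
    """True if each feature named in condition has an admissible value.

    Features not named in the condition are ignored (subconfiguration test).
    """
    return all(constituent.get(name) in allowed
               for name, allowed in condition.items())


def predicate(verb, ending):
    """Sketch of a rule like 013: verb stem of class 27 plus ending -t."""
    if not matches(verb, {"cat": {"V"}, "CL": {27}}):
        return None
    if not matches(ending, {"cat": {"END"}, "TY": {"T"}}):
        return None
    # Carry up all verb features except category and paradigmatic class.
    node = {k: v for k, v in verb.items() if k not in ("cat", "CL")}
    node.update(cat="PRED", PS_NU={("3", "S"), ("2", "P")},
                TN="PR", VC="A", MD="I")
    return node


def noun_phrase(det, adj, nominal):
    """Sketch of a rule like 019: determiner, adjective, nominal."""
    node = {"cat": "NP", "PS": {3}, "TY": nominal["TY"]}
    for feature in ("GD", "NU", "CA"):
        common = det[feature] & adj[feature] & nominal[feature]
        if not common:          # no shared value: agreement fails
            return None
        node[feature] = common
    if det["IN"] & adj["IN"]:   # inflection types must NOT agree
        return None
    return node


# Hypothetical encodings of the dictionary entries for the example sentence.
LIEG = {"cat": "V", "CL": 27, "PX": {25}, "GC": {"X"}, "TO": {"X"}, "TS": {"AB"}}
T = {"cat": "END", "TY": "T"}
EINE = {"GD": {"F"}, "CA": {"N", "A"}, "NU": {"S"}, "IN": {"S"}}
GROSSE = {"GD": {"M", "F", "N"}, "CA": {"N", "A"}, "NU": {"S"}, "IN": {"W"}}
ANZAHL = {"GD": {"F"}, "CA": {"N", "G", "D", "A"}, "NU": {"S"}, "TY": {"QU"}}

if __name__ == "__main__":
    print(predicate(LIEG, T))
    print(noun_phrase(EINE, GROSSE, ANZAHL))
```

With these entries the noun phrase comes out feminine, singular, and
nominative or accusative, which agrees with the description of the
corresponding NP node in the explanation of symbols.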



In spite of the greater economy of subscript rules,
however, problems resulting from permutations of elements
of phrasal and idiomatic expressions cannot be easily
solved in surface analysis. For this reason, the analysis
of sentences containing such elements is, in practice,
performed in two steps at the LRC: surface analysis and
standard analysis. In standard analysis the elements of
phrasal and idiomatic expressions are re-ordered to a
pre-determined standard order and are then treated as one
single dictionary item, possibly with internal variable
slots. A detailed description of standard analysis may
be found in Research in German-English Machine Translation
on Syntactic Level, Final Technical Report, RADC-TR-69-368,
Volume II, August 1970.

The following is an explanation of the symbols used
in the structural tree. The symbols are defined going
from left to right in the sentence and from the bottom to
the top of the tree.



Lexical level:

EXPLET
    = Expletive es; not a pronoun but rather a syntactically empty
      placeholder for the subject of the sentence.

V  CL(27)  PX(25'...)/  GC(X'...)/  TO(X'...)/  TS(AB'...)
    = This verb of paradigmatic class 27 may be used with any of a
      number of specified separable prefixes, among them prefix 25,
      which is the German prefix vor. If it is used in conjunction
      with this particular prefix, it is intransitive (governs case x;
      semantic type of object x) and takes a subject of the semantic
      class type abstract.

END  TY(T)
    = Ending of type -t.

DET  GD(F)  CA(N,A)  NU(S)  IN(S)
    = Determiner, gender feminine, ambiguous with respect to case,
      i.e. it may be considered nominative or accusative, number
      singular, strongly inflected.

A  CL(7)
    = Adjective of paradigmatic class 7.

END  TY(E)
    = Ending of type -e.

N  CL(10)  GD(F)  TY(QU)
    = Noun of paradigmatic class 10, gender feminine, type
      quantifier, i.e. a quantifying noun which may be followed by a
      von PRPH and then constitutes a modifier of the head noun in
      that PRPH.

PREP  PR(24)  GC(D)
    = The preposition is identified as preposition number 24 (von)
      and has the feature "governs case dative".

N  CL(11)  GD(N)  TY(AB+CN)
    = A noun of the paradigmatic class 11, gender neuter, and
      semantic type abstract and countable.

END  TY(EN)
    = Ending of the type -en.

PRFX  PX(25)
    = This prefix is identified as prefix number 25 (vor).

PRD
    = The period is marked as being a marginal symbol, i.e. it
      constitutes the boundary [...]



- paradigmatic class - is dropped because it is no longer
relevant.) In addition, it has the features person and
number which mark it as either 3rd person singular or 2nd
person plural. (The apostrophe and slash establish this
relation between the individual features.) It is also
marked as: tense present, voice active, and mood indicative.

ADJ  GD(M'F,N)/  CA(N'N,A)/  NU(S)  IN(W)
    = With respect to gender and case, the inflected adjective is
      characterized as masculine nominative; or feminine or neuter
      nominative or accusative. In number it is singular; the
      inflection is weak.

NO  GD(F)  CA(N,G,D,A)  NU(S)  TY(QU)
    = The inflected nominal has the same gender and type
      information as the dictionary noun entry and in addition has
      the tags number singular, case 4-way ambiguous, i.e. it is
      either nominative, genitive, dative, or accusative, depending
      on its environment.

NO  GD(N)  CA(D)  NU(P)  TY(AB+CN)
    = Inflected nominal with the gender and type of the underlying
      noun stem, case dative, number plural.



Phrase level:

NP  GD(F)  CA(N,A)  NU(S)  TY(QU)  PS(3)
    = The noun phrase has the gender, case, and number
      characteristics in which the underlying determiner, adjective
      and noun agree, namely feminine nominative or accusative
      singular; the type is that of the head noun; the NP is marked
      as 3rd person.

NP  GD(N)  CA(D)  NU(P)  TY(AB+CN)  PS(3)
    = Noun phrase with all syntactic and semantic features of the
      underlying nominal, identified as 3rd person.

PRPH  PR(24)  TY(AB+CN)  NU(P)
    = This prepositional phrase is identified as dominating
      preposition 24, i.e. von, and an NP with a head noun of type
      abstract and countable, number plural.

NP  GD(F)  CA(N,A)  NU(S)  TY(AB+CN)  PS(3)
    = This noun phrase, which dominates an NP followed by a von
      PRPH, has the syntactic features of the dominated NP: gender
      feminine, case nominative or accusative, number singular, and
      the semantic features of the head noun of the dominated PRPH:
      type abstract and countable. It is also marked as an NP in the
      3rd person.



Clause and sentence level:

CLS  TY(INV)  TN(PR)
    = This clause is of the type with inverted word order; it may
      be followed by a [...] to form a question or, as in this
      sentence, it may be preceded by an expletive es to form a
      declarative sentence; its tense is present.

CLS  TY(DC)  TN(PR)
    = A clause of type declarative, tense present.

SNT  TY(DC)
    = A sentence of type declarative.














LEXICAL FEATURES IN TRANSLATION AND PARAPHRASING: AN EXPERIMENT 



by 



Rolf Stachowitz 





Linguistics Research Center
The University of Texas at Austin 







I Introduction 

It is obvious to any user of a monolingual dictionary that
the meaning of a lexical item is not only dependent on the
external form of the item but also on its syntactic or
semo-syntactic properties.¹ The terms homonymy and polysemy reflect
this knowledge. It is equally obvious for the user of a better
than average bilingual dictionary that the meaning of a lexical
item is also a function of each selection restriction associated
with it. This observation is evident from the fact that different
translations are associated with a particular lexical item
dependent on the syntactic and/or semantic properties of the
constituents in its environment. The verb erinnern provides
an example for German: In the environment "reflexive pronoun"
its translation is remember; in the environment "non-reflexive
object" its translation is remind.
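
As a minimal sketch of what such an environment-dependent dictionary entry amounts to, the erinnern example can be written as a lookup keyed on the item and its environment. The table layout and the function name `translate` are assumptions made for illustration; only the two environments and the two translations come from the text.

```python
# Hypothetical bilingual entry: each selection restriction (environment)
# of a lexical item is paired with its own translation equivalent.
TRANSLATIONS = {
    "erinnern": {
        "reflexive pronoun": "remember",
        "non-reflexive object": "remind",
    },
}


def translate(item, environment):
    """Return the translation equivalent licensed by the given environment."""
    try:
        return TRANSLATIONS[item][environment]
    except KeyError:
        raise LookupError(f"no translation for {item!r} in {environment!r}")


print(translate("erinnern", "reflexive pronoun"))
print(translate("erinnern", "non-reflexive object"))
```

The point of the sketch is only that the key is a pair: the external form alone does not determine the translation.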

The observations are, of course, true for lexical items in
a language independent of their translatability into some other
language. Only a few monolingual dictionaries, however, make
this observation explicit. Among the few notable examples are
the German Woerterbuch der deutschen Gegenwartssprache² and
Hornby's An Advanced Learner's Dictionary.³ Hornby lists for
each verb the complement structures with which it may occur
and the meanings it has in each environment. Thus, observe
may mean to take notice of (to watch) or to comment in
the environment "that S", e.g. He observed that his wife had
arrived. However, in the environment "NP", observe can only
have the first interpretation, e.g. He observed the arrival of
his wife.⁴

In view of the possibility of specifying the meaning of a 
lexical item or selecting a proper translation equivalent for it 
by taking its environment into account, it may seem surprising 
to the uninitiated that earlier MT systems had attempted to make 
such selections based on different criteria: considerations of 

the type of text to be translated or of probability of occurrences 
of lexical items. The difficulties confronting attempts to 
access the selection restrictions of a lexical item during the 
surface analysis of a sentence by means of a context-free grammar 
have been described in various monographs. These difficulties 
are multiplied when attempting the translation of languages, 
such as German, where various agreement and government relations 
hold between constituents, where lexical items and phrasal ex- 
pressions often occur as discontinuous elements , and where 
sentence constituents can occur in various orders. The attempt 
to incorporate selection restrictions of lexical items into
non-terminal symbols of context-free grammars would have increased
the number of such rules to unmanageable proportions. For this
reason, the incorporation of such selection restrictions was
consequently suppressed. The loss was two-fold:







a) The number of syntactic interpretations for a sentence 
often increased ("forced readings”). 

b) The selection of proper translation equivalents had 
to be based on different criteria. 



II Background of the Experiment 

In summer 1966 I began investigating the possibilities of
improving various parts of the Linguistics Research System⁵
in order to cope with the increasing difficulties encountered
in the attempts to analyze and translate sentences in natural
language: the prohibitively large number of syntactic and
translation rules necessary for the description and translation
of surface structures into surface structures and the inability
to deal with discontinuous constituents.⁶ The research was
influenced by the following guidelines:

1) to improve translation by permitting access to selection
restrictions;

2) to decrease the number of forced readings assigned to
sentences without an unreasonable increase in the number of
grammar and translation rules;

3) to preserve as many as possible of the various algorithms
used for surface analysis, translation mapping and surface
production.

The results were reported in December 1966 in an unpublished
paper which stated:

a) that vastly improved translations were possible by
performing translation not from surface structures into surface
structures but from standardized surface structures (standard
strings) into standardized surface structures;

b) that these standard strings could be derived from the
syntactic reading of a sentence by means of an additional
straightforward algorithm;

c) that these translations could be obtained with an
overall decrease of grammar rules;

d) that the core of the LRS algorithms could be retained;⁷

e) that non-trivial paraphrases could be performed over
standard strings which were not possible over surface strings.

An experiment was subsequently performed to compare the
proposed translation procedure with the established one. In
order to facilitate this comparison, a text was selected for
translation, part of which had been translated in February 1966
using the Linguistics Research Center's first and second order
translation system. Since the program which derived the standard
strings from the corresponding sentence readings did not
exist, the standard terminals were represented as surface
terminals enclosed in asterisks. Only in cases where surface
terminals occurred as homographs in the given text was a
descriptor added in parentheses to reflect the disambiguating
effect of the standardization procedure.

In order to reduce the time spent on this experiment, only
one standard string of those sentences which had more than one
surface reading was selected. (The number of readings for
sentence 486 was 24, sentences 488, 489, and 492 had two
readings each, all others had one.)

III Standard Strings

The standard representation of a sentence is a reordering of
its terminal elements (with their part-of-speech interpretation)
based on the surface interpretation of that sentence. The
reordering could be performed by means of ordering instructions
assigned to each constituent in the consequent of a rule which
is part of the sentence reading.⁸

Assume the sentence He looked the word up is analyzed by
the rules represented in the following tree diagram:



[Tree diagram with root S]

(The digits at the end of branches determine the mapping order
of the sister nodes.)

The standard string corresponding to this reading would
then be:

he      ed       look   up         the     word
<PRN>   <END>    <V>    <ADPREP>   <DET>   <N>
        <PAST>

where the part-of-speech interpretation of each terminal is
represented in angled brackets. (One can obtain a standard
string by tracing down from each node, beginning with S, all
branches in their indicated order and not tracing up a branch
before all terminals below that branch have been reached.)
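
The tracing procedure in the parenthesis above is a depth-first traversal in which sister nodes are visited in the order given by their digits. The following Python sketch illustrates it. The tree itself is hypothetical: the constituent labels NP, VP and VERB and the particular digits are assumptions chosen so that the traversal yields the standard string shown above; they are not taken from the original diagram.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    label: str                        # constituent label or part-of-speech tag
    word: Optional[str] = None        # set for terminals only
    # (mapping order, child) pairs, listed in surface order
    children: List[Tuple[int, "Node"]] = field(default_factory=list)


def standard_string(node):
    """Trace down all branches in their indicated order, depth first."""
    if node.word is not None:
        return [(node.word, node.label)]
    result = []
    for _, child in sorted(node.children, key=lambda pair: pair[0]):
        result.extend(standard_string(child))
    return result


# Hypothetical reading of the surface string "he look-ed the word up".
verb = Node("VERB", children=[(2, Node("V", "look")), (1, Node("END", "ed"))])
obj = Node("NP", children=[(1, Node("DET", "the")), (2, Node("N", "word"))])
vp = Node("VP", children=[(1, verb), (3, obj), (2, Node("ADPREP", "up"))])
tree = Node("S", children=[(1, Node("PRN", "he")), (2, vp)])

words = [word for word, _ in standard_string(tree)]
print(" ".join(words))
```

Because a subtree is exhausted before the traversal returns to its parent, no branch is traced upward before all terminals below it have been reached.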

The following standard order was defined for German surface
constituents:

For clause level elements: Subject (of an active sentence),
agent adverbial (of a passive sentence), direct object, subject
(of a passive sentence), complement, indirect object,
adverbials.

For phrase level elements:

Verbals: Finite verb, non-finite verb, prefix.

Noun phrases: Head, post-modifier, pre-modifier, determiner.

Prepositional phrases: Preposition, object.

For word level: Affixes, stem.

Conjoined elements "A, B and C": and, A B C.
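
A small sketch may make the phrase level ordering concrete. It reorders the elements of one noun phrase according to a list of roles; the German list is the one given above, and the English list is the sequence determiner, pre-modifier, post-modifier, head that the text defines for English. The function name `reorder`, the role strings and the role labelling of the example phrase are assumptions made for illustration, and word level ordering (affixes before stem) is ignored here.

```python
# Standard orders for noun phrase elements as stated in the text.
GERMAN_NP_ORDER = ["head", "post-modifier", "pre-modifier", "determiner"]
ENGLISH_NP_ORDER = ["determiner", "pre-modifier", "post-modifier", "head"]


def reorder(constituents, order):
    """Sort (role, text) pairs into the given standard order (stable sort)."""
    return sorted(constituents, key=lambda c: order.index(c[0]))


# Surface order of "die hellen Linien der Sonnenatmosphaere"
# (role labels assigned by hand for this example).
surface = [
    ("determiner", "die"),
    ("pre-modifier", "hellen"),
    ("head", "Linien"),
    ("post-modifier", "der Sonnenatmosphaere"),
]

german_standard = [text for _, text in reorder(surface, GERMAN_NP_ORDER)]
print(german_standard)
```

Swapping in `ENGLISH_NP_ORDER` yields the English sequence for the same roles, which shows that the two standard orders are independent parameters of the same procedure.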







The standard order defined for English differed from that
for German only in that the elements of noun phrases occurred
in the sequence: Determiner, pre-modifier, post-modifier,
head of noun phrase. No significance is to be attributed to
this difference; the distinction was made primarily to facilitate
the reading of the output, the English standard strings. The
distinction, however, shows the independence of the standard
orders of the two languages.

The greater ease with which strings in standard
order could be analyzed may be evident from a comparison of the
syntactic description of the following five sentences with the
corresponding standard descriptions.

1) Das Buch hat er einer Frau gegeben.

2) Einer Frau hat er das Buch gegeben.

3) Der Frau ist er gefolgt.

4) Jener Frau hat er geholfen.

5) Das Buch hat er gelesen.

(Clause level constituents consisting of more than one word
are underlined.) These sentences were analysed by the following
rules:

1') S -> OBJ[ACC]  AUX[H,3,SG]  SUBJ[3,SG]  OBJ[DAT]  PASTPART[H,ACC,DAT] ⁹

2') S -> OBJ[DAT]  AUX[H,3,SG]  SUBJ[3,SG]  OBJ[ACC]  PASTPART[H,ACC,DAT]

3') S -> OBJ[DAT]  AUX[S,3,SG]  SUBJ[3,SG]  PASTPART[S,DAT]

4') S -> OBJ[DAT]  AUX[H,3,SG]  SUBJ[3,SG]  PASTPART[H,DAT]

5') S -> OBJ[ACC]  AUX[H,3,SG]  SUBJ[3,SG]  PASTPART[H,ACC]



As we can observe, each change in word order (sentences 1
and 2), syntactic agreement (sentences 3 and 4) or government
(sentences 4 and 5) had to be analyzed by a new sentence rule.¹⁰
The corresponding standard representations, however, permitted
a far more economic analysis.




[Tree diagrams: standard representations of sentences 1) through 5)]

Firstly, it will be noticed that permutations as in sentences
1) and 2) were reduced to the same representation. Secondly,
it was possible to concatenate the verb with its immediately
contiguous elements, dropping with each concatenation the
information that was necessary for the concatenation. This
resulted in a considerably smaller number of grammar rules.

Note that all four readings have in common the rules
S -> SUBJ[3,SG] VP[3,SG] and VP[3,SG] -> END[3,SG] V.
Sentences 1), 3) and 4) also have in common the rule
V -> V[DAT] OBJ[DAT]. It was, finally, possible to treat
discontinuous lexical items as one piece and assign them anew
their correct syntactic interpretation.¹² Thus the rule

S[ACTIVE] -> OBJ(4)  PRED(2)  SUBJ(1)  PRFX(3)

(the desired order of the constituents is given in parentheses),
interpreting sentences such as

6) Diese Arbeit stellten sie ein = They discontinued this work.

7) Diese Loesung lehnte er ab = He rejected this solution.

generated the standard strings given in the tree diagrams below.¹³
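
The effect of the ordering digits in this rule can be sketched as follows. The rule is encoded as a list of (constituent, order) pairs in surface order; applying it to the surface constituents of sentences 6) and 7) brings the finite verb and its separated prefix together. The list encoding and the function name `apply_rule` are assumptions for illustration only, and the segmentation of the sentences into constituents is done by hand.

```python
# The rule from the text: constituents in surface order with their desired order.
RULE = [("OBJ", 4), ("PRED", 2), ("SUBJ", 1), ("PRFX", 3)]


def apply_rule(rule, constituents):
    """Pair surface constituents with the rule and emit them in standard order."""
    if len(rule) != len(constituents):
        raise ValueError("rule and sentence have different lengths")
    paired = zip(rule, constituents)
    ordered = sorted(paired, key=lambda pair: pair[0][1])
    return [(label, text) for (label, _), text in ordered]


sentence_6 = ["diese Arbeit", "stellten", "sie", "ein"]
sentence_7 = ["diese Loesung", "lehnte", "er", "ab"]

standard_6 = apply_rule(RULE, sentence_6)
standard_7 = apply_rule(RULE, sentence_7)
print(standard_6)
print(standard_7)
```

In both results PRED and PRFX are adjacent, so a single rule can interpret the verb and prefix as one lexical piece regardless of which prefix occurs.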




" S 




IV The Selection of Translation Equivalents 

The possibility of associating more comprehensive syntactic
information with lexical pieces in standard strings as a
consequence permitted an improved selection of translation
equivalents. The list in Figures 7-1 through 7-6 contains a
number of German items with their selection restrictions and
the particular translations associated with each selection
restriction. The lexical items are listed in the order in which
they occur in the translated text. The selection restrictions
which apply to the text are given a check mark. No
semo-syntactic features, like HU, AN, AB (human, animate, abstract)
were taken into account when performing the translation; for
those features, cf. my appended paper "Requirements for
Machine Translation: Problems, Solutions, Prospects."

The translation possibilities which resulted from the performed
subclassification are indicated by light broken lines, the
ones selected by heavy underlines. Of particular interest
is one of the translations for gelingen (sentence 494, Figure 7)
which permitted the mapping represented by the following diagram.

[Mapping diagram for gelingen]

"breit + unit of measure" could be mapped into "wide + unit
of measure" or "unit of measure + in width", Zuordnung zu into
relation to or connection with. The noun phrase lange Zeit
could be recognized as an adverbial of extension in time
instead of as an object due to the feature TIM.

V Paraphrases 

In order to show the variety of translations or paraphrases
possible over standard strings, a number of non-ad-hoc systematic
synonymy relationships were defined for English resulting in
the paraphrases given in Figures 3 and 4. Synonymy relationships were
defined between lexical pieces and between syntactic structures.

Examples of the latter are the active : passive transformation,
the perfect tense : past tense transformation¹⁵ and the
noun-pre-modifier : noun-post-modifier transformation. Trivial
examples of lexical paraphrases were simple synonymy substitutions
like get : obtain, prominence : protuberance, or circle : ring;
less trivial examples were lunar : moon, solar : sun, luminous :
light, bright : (to) shine, manage to (+ infinitive) : succeed in
(+ gerund). The effect of the syntactic classification of lexical
items which had been defined as synonymous resulted in a
selection of only those syntactic superstructures which interpreted
them. Thus syntactic superstructures which were interpreted
by the same normal form expression (translation term) but which
could not form a well-formed tree with the selected lexical
items were filtered out during the production phase.¹⁶ The
effect of this filtering function is shown for two examples
in Figure 6; the sequence of normal form expressions S108, S100,
S108, S104, L176, S104, L125 (to be read from top to bottom,
left to right) simultaneously represents the four paraphrases
the solar disk, the disk of the sun, the sun's disk, the sun
disk.¹⁷
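
The filtering effect can be illustrated with a deliberately simplified sketch. It does not reproduce the normal form expressions of the LRS; instead, each syntactic superstructure is reduced to a template that states which part of speech its modifier slot accepts, and each synonymous lexical item carries its part of speech. All names, templates and category labels below are assumptions for illustration; only the four resulting paraphrases come from the text.

```python
# Synonymous lexical items with their syntactic classification (assumed labels).
LEXICAL_ITEMS = [("solar", "ADJ"), ("sun", "N")]

# Superstructures sharing one translation term; each names the category
# its modifier slot requires (hypothetical templates).
SUPERSTRUCTURES = [
    ("ADJ", "the {m} disk"),         # noun with adjectival pre-modifier
    ("N", "the disk of the {m}"),    # noun with of post-modifier
    ("N", "the {m}'s disk"),         # noun with genitive pre-modifier
    ("N", "the {m} disk"),           # noun with nominal pre-modifier
]


def paraphrases():
    """Keep only combinations in which item and superstructure fit together."""
    kept = []
    for word, category in LEXICAL_ITEMS:
        for required, template in SUPERSTRUCTURES:
            if required == category:  # otherwise the tree would be ill-formed
                kept.append(template.format(m=word))
    return kept


result = paraphrases()
print(result)
```

Of the eight possible combinations, four survive; ill-formed candidates such as "the disk of the solar" are never produced.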



VI Translations 

The simulated standard representation of the German original
text (Figure 1) is given in Figure 2. The computer output, the
mechanical translations, is shown in Figures 4-1 through 4-9.

The translations in Figures 5-1 through 5-3 show an
approximation to English normal word order. A more precise rendering
would have required a separate processing stage, a rearrangement
part. This stage seemed unnecessary for the purpose of the
experiment since it is a simple reversal of the generation of
standard strings from surface strings. A surface representation
of the English translations of the German corpus is given in
Figures 3-1 through 3-2.

The translation was performed using some of the then existing
LRC analysis and translation algorithms. These, in order
to speed up the actual processing time, stored in core all
readings found. Whenever the number of readings exceeded the space
allotted for them, certain readings were irretrievably dropped.
If those readings were needed during the production phase, the
corresponding German lexical or syntactic structures were used
instead. This effect is noticeable in the occurrence of
asterisked items in the English translations (also items
given in script in Figure 3), in the occurrence of the German
standard order in noun phrases¹⁸ which is different from the
defined English standard order, or simply in the ungrammaticality
of the generated sentence.



VII Conclusion 

In spite of the improved translation capabilities through 
translation over standard structures, the number of rules 
necessary, using context-free grammars with simple vocabulary 
symbols, was felt to be unnecessarily high. The changes made 
to remedy this deficiency are described in Lehmann/Stachowitz
1970, Vol. II.















FOOTNOTES 



1 Thus the meaning of the noun man is different from that of the
verb man, the meaning of the 'non-human' noun conductor
different from that of the 'human' noun.

2 Woerterbuch der deutschen Gegenwartssprache, herausgegeben
von Ruth Klappenbach und Wolfgang Steinitz, Akademie-Verlag,
Berlin, 1968 ff.

3 An Advanced Learner's Dictionary by Hornby, Gatenby and
Wakefield, London, Oxford University Press, 1948.

4 This nominalization of the that-clause can be interpreted
as a counterexample to various claims:

1) The combined claim that transformations are
meaning-preserving and nominalizations are derived
transformationally from sentences;

2) that semantic interpretations apply to deep structures
before non-lexical transformations have applied.

Other verbs which behave like observe are remark and
notice. Note that watch cannot occur in the environment
"that S".

5 A comprehensive statement on the algorithms of the
Linguistics Research System as used until May 1968 is
given in Chapter VIII of Final Report, Linguistic
Information Processing Study, DA 36-039 AMC-2162(E), 1 May 1965 -
30 April 1966; and Dynamic Adaptive Data Base Management
Study, DA 28-043 AMC-02276(E), 16 May 1966 - 15 May 1967,
The University of Texas, Linguistics Research Center, Austin,
Texas, November 1968.



6 A comprehensive description of the problems encountered
can be found in Lehmann/Stachowitz: Research in German-
English Machine Translation on Syntactic Level, Vol. II,
The University of Texas at Austin, August 1970.

7 Research performed during Spring of 1968 has led to the
design of completely new analysis and translation algorithms
which process context-free grammars with complex terminal
and non-terminal symbols. Cf. Lehmann/Stachowitz 1970
and the appended paper "Requirements for Machine Translation:
Problems, Solutions, Prospects."

8 Constituents in a rule consequent were assigned a
predetermined order to permit the translation of sentences whose
constituents could occur in different surface orders, e.g.
Mark bewunderten sie - Sie bewunderten Mark = They admired
Mark.

9 The LRC verb dictionaries only contained descriptors
pertaining to paradigmatic information. The verb constituents
in those rules thus did not contain the descriptors
pertaining to case government or auxiliary agreement information.

10 A trivial improvement for rules 1' and 2', resulting
from the concatenation of the participle with the
contiguous object before concatenating the new constituent
with the other sentence constituents, was not possible
in the earlier LRC system due to the ordering instructions
attached to each constituent. Cf. Lehmann/Stachowitz 1970.

11 The affixes are actually represented by "dummy" terminals;
these are again replaced by the proper affixes during the
output phase. Cf. Lehmann/Stachowitz 1970.

12 The translation of verb-prefix combinations, which occur
discontinuously in German main clauses, would have
required sentence rules in which the actual prefix would
have had to be mentioned as a feature of the constituents
involved. For example, Diese Loesung schlug er vor (He
suggested this solution) would have had to be analyzed
by a rule containing as constituents:

OBJ[ACC]  PRED[VOR,ACC,3,SG]  SUBJ[3,SG]  PRFX[VOR]

Each change of prefix would have required a new sentence
rule, e.g. Diese Loesung nahm er an (He accepted this
solution):

OBJ[ACC]  PRED[AN,ACC,3,SG]  SUBJ[3,SG]  PRFX[AN]

Such rules, of course, were never written.

13 Compare the translation equivalents einstellen = suspend,
ablehnen = refuse in contrast to the translation of the
corresponding simple verbs stellen = put, lehnen = lean.

14 In cases where the actually performed subclassification
did not suffice to distinguish between different meanings
of an item (e.g. erhalten with the readings preserve,
maintain vs. receive), the translation given in
the February 1966 translation was accepted. Cf. also
footnote 21.

15 This paraphrase was defined to permit the translation of
the German perfect tense as in sentence 492 into both
English present perfect and past tense.

16 One can interpret a sequence of normal form expressions
as instructions to generate a tree by attaching the top
node of a substructure to a non-terminal node of another
structure, provided the respective labels are identical.
The sequence of normal form expressions interpreting a
tree thus imposes a well-formedness condition on the
construction of all sentence trees with that normal form
reading. Cf. also McCawley 1968.

17 The letter S stands for "non-lexical (syntactic) tree",
the letter L for "lexical tree". The numbers were assigned
in ascending order beginning with 100. These expressions
can, of course, be replaced by meaningful expressions
which can be interpreted as the vocabulary symbols of an
interlingua or universal grammar.

18 The English subject B. Edlen in sentence 494 corresponding
to the German dative object appeared in the position
for "indirect object" whenever a necessary structure was
dropped.

19 Figure 3: Only the paraphrases given in Figures 4-1 through
4-7 are given here. The items in script do not occur in
any translation; the items in parentheses were provided as
optional translations. The repeated "optionality" of the is due to
the fact that it was not provided as a lexical equivalent
of German /der/ but supplied by means of a syntactic
normal form expression which should have been based on the
non-encoded information that some nouns may optionally
occur without the, like earth, the earth. The equivalents
completely, wholly, entirely, very were not subclassified
for adjective vs. participle modification (sentence 486).
Luminous corona (sentence 492) results from an incorrect
rule.

20 Figure 7: This translation, not given in any dictionaries,
was provided in the February 66 translation.

21 The selection of the correct translation equivalent for this
pattern depends on the understanding of the sentence.







Bibliography 



Bech, Gunnar, Studien ueber das deutsche Verbum Infinitum,
Det Kongelige Danske Videnskabernes Selskab, Dan. Hist.
Filol. Medd. 35, no. 2, Copenhagen, 1955; 36, no. 6, 1957.

Bierwisch, Manfred, Grammatik des deutschen Verbs, Studia
Grammatica II, Akademie Verlag, Berlin, 1963.

Chomsky, Noam, Aspects of the Theory of Syntax, M.I.T. Press,
Cambridge, 1965.

Chomsky, Noam, Syntactic Structures, Mouton, The Hague, 1957.

Gruber, Jeffrey S., Studies in Lexical Relations, M.I.T.,
Cambridge, September 1965.

Harris, Zellig S., String Analysis of Sentence Structure,
Mouton & Co., The Hague, 1962.

Harris, Zellig S., "Transformational Theory", Language,
41, No. 3, 1965.

Hornby, A.S., A Guide to Patterns and Usage in English,
Oxford University Press, London, 1960.

McCawley, James D., "Concerning the Base Component of a
Transformational Grammar", Foundations of Language,
Volume 3, No. 3, August 1968.

Messinger, Heinz, Langenscheidts Handwoerterbuch Deutsch-
Englisch, Langenscheidt KG, Berlin-Schoeneberg, 1960.

Postal, P., Constituent Structure - A Study of Contemporary
Models of Syntactic Structure, Publications of the Research
Center in Anthropology, Folklore, and Linguistics, Indiana
University, Bloomington, 1964.

Tesnière, Lucien, Eléments de Syntaxe Structurale,
Librairie C. Klincksieck, Paris 1966 (deuxième édition revue
et corrigée).




GERMAN CORPUS 

999.487

DIE LINIEN DES WASSERSTOFFS, DES HELIUMS UND VIELER METALLE
TRETEN HIER AUF.

999.488

WENN DIE MONDSCHEIBE DIE SONNE GANZ VERDECKT, ERSCHEINT EIN ROTER
10 -- 15 BOGENSEKUNDEN BREITER RING UM DIE SONNE.

999.489

DAS IST DIE CHROMOSPHAERE MIT DEN PROTUBERANZEN.

999.490

WEITER AUSSEN SCHLIESST ALS SILBERWEISSER LICHTSCHWACHER SAUM
DIE SONNENKORONA AN.

999.491

IN DER CHROMOSPHAERE FINDET MAN HAUPTSAECHLICH
WASSERSTOFF-, HELIUM- UND KALZIUMLINIEN, ABER AUCH
SPEKTRALLINIEN ANDERER METALLE.

999.492

IM LICHTE DER KORONA SIND MEHRERE HELLE SPEKTRALLINIEN
AUFGEFUNDEN WORDEN, DEREN ZUORDNUNG ZU BEKANNTEN ELEMENTEN LANGE
ZEIT UNBEKANNT BLIEB.

999.494

ERST IM JAHRE 1941 GELANG ES B. EDLEN IN UPSALA DIESE
SPEKTRALLINIEN IN GEEIGNETEN IRDISCHEN LICHTQUELLEN ZU ERHALTEN.

999.486

DIE HELLEN LINIEN DER DAMPFFOERMIGEN SONNENATMOSPHAERE KANN MAN
IN DER SOGENANNTEN UMKEHRENDEN SCHICHT, EINER SCHMALEN DAMPFHUELLE
OBERHALB DER AEUSSEREN SONNENBEGRENZUNG, DER PHOTOSPHAERE, FUER
EINIGE WENIGE AUGENBLICKE BEOBACHTEN, WENN BEI EINER SONNENFINSTERNIS


— - - " RANH v s 





German Standard Strings 



JOB PROOF G-TXT RETRIEVAL OF 30 . JANUARY ,* 67 
PAGE l 



COOL 999, 486 ,RST ,011867 

COO 2 * MAN * ** *K ANN * * GN* »BEOBACHT* *im« »I_JNIE* *ATM0SPHA6RE* *N* 

CC03 * SON N E * ** » EN * *DAMPFF0£RMI G * * * *DER* * * *EN* *HELL* ** *016* 

0004 * I N* *5CHIGHT* *,* •HUELLE® *OAMPF* * • *GBERHALB* *BEGRENZUNG* 

0005 *N * * SONN E * • * * *EN* *A6USSER* ** *DER * **.*,*■ * PHOTOS PH AERE « *DER* 
CO C 6 *,* *EN* *5CHMAL* ** *EINER* ' *EN* *END* *KEHR* *UM*(PFX) *EN* 

0007 *SOG ENANNTf ** *DER* ** *FUER* *E» *BLICK* »N» *AUGE* ** *6* 

0000 *ViEN I G* *E* *EINIG* ** ** *WENN* *MGND* *6* *END* *SCKREIT* *FQRT* 
0009 ** » D 6 R * ** »T* * LAGS 5* *FRE I * *RAND* *OBERFL A6CH6* *N* *SGNNE* ** 
COLO *CER » »» *EN* *SCHMAL* * GANZ* ** *6 INEN* *AUF* I PP ) »SG I TE* *EN* 
SOIL * E IN • ** *CER » ** *NOCH* *EBEN* #GERAD6* •* »BE I * *F I NSTERN I S *. *N* 

0012 *SONNE* ** *E INER* *» *1* *UM* ( FLX) •SPiKTR* *FLASH* ** *SOG.* ** 

0013 *}**#*,**•*.* 



0001 959, 487, RST, 011867 

0002 *N* * L I N I E * *UN0* *,« *S* *WAS5ERSTOFF* *OES* *S* *H6LIUM* 

0003 * DES * «E* *M£TALL* *ER* *VIEL* ** ** *OIE* ** *EN* *TRET * *AUF* 
C C 04 *H I E R » ** ** *,* 



0001 959, 480, RST, 011867 

0002 *R ING* «»* , *ER* *BREIT* *N* -SEKUNDE* »BOGEN* *# * *10* 

0003 *15* ** * ER * * RO T * ** *EIN* *# *T* >ERSCHpjN* *UM* *SCNNE* *DIE* 

0004 #* *W6NN* * 5CFE I B 6* * MCND * ** *OIE* ** *T* *Vt'ROECK* *SONNE* 

QCO*i *GIE* *GANZ* *• **»,* ** *** 



C001 999, 489, RST, 011867 

0002 . *OAS* ( 0 ) * * * I ST* *CHR0M0SPHAER6* *OJE* #MIT* *EN» *PROTUBERANZ* 

0003 •DEN* ** ** *.* 



0001. 995,490 ,R ST, 01 1867 ■' i.ri: vj.-i ''-y;.’ ;• '•! 

0002 * A * *KORGN« *N* *SONNE* f**DIi* ** *T* *SCHL I ESS* *AN* *AL.S* 

000 3 *SAUM* *£R* *L I CHTSCH WACH* #ER* * S I LBERWE ISS * ** *AUSSEN* *6R* 
GC04 *W£IT# ** ,*«;*.* ,. . 



0001 959,491 , RST ,G1 1867 

0002 *MAN * *.*- * g T * *F I ND * 



C003 * WASS0RSTOFF* * — * *H6L I UM* *K ALZ IUM* ** 
0004 *AL* *SPEKTR* #E* * M E T AtL * * E R * *AA(D ER # 



0005 



►0 6R* 



*, ABER AUCH* *N* *LINIE* *UNO« *,**-* 

HAUPTSAECHL ICH* *N* *LINIE*. * 
* ** * I N * »CH ROMOSPHAER E * 









CCOl 999, 492, RSX»0 11867 



■-V-. 












0C02 



vr + i 



* *- * S I ND * * W ORDE N* «G£* * 






► FUNC * 



CO 03 *SPEKTR* *£ * *HELL* *6 * § *M EHSeR * * 5 ^ I ** * E * * * LICHT* 
0 CC4 *OER* ** *M» ** * * * Z U O R 0 NUN G* * ZU* *EN* • ELEMENT* ^ 






*AL* 



CC4bV- ** *ZUaRONUNG« *ZU* *EN* *EL6MENT* *EN, 



•KORON* 






^bekannt* 





Figure 3-1

English Paraphrases of German Corpus in Surface Representation¹⁹

487 Lines of (the) hydrogen, (the) helium and many
    metals {occur | appear} here.

488 When (the) {lunar disk | disk of moon} {hides | covers}
    sun {completely | wholly | entirely}, red {ring | circle}
    10 to 15 {arc seconds | seconds of arc} {in width | wide}
    appears around (the) sun.

489 This is (the) chromosphere with (the)
    {prominences | protuberances}.

490 (The) {corona of (the) sun | solar corona} follows as a
    silvery white dim {border | boundary} farther out.

491 {Above all | Mainly | Chiefly}
    {hydrogen's, helium's and calcium's | hydrogen, helium and calcium}
    lines, but also
    {other metals' spectral lines | spectral lines of other metals}
    are found in (the) chromosphere.
    One finds ... in (the) chromosphere.









Figure 3-2

494 {Only in | Not before | Not until} 1941 did B. Edlen in Upsala
    {succeed in getting | manage to get} these spectral lines in
    suitable {terrestrial | earth} {luminous | light} sources.
    B. Edlen in Upsala
    {managed to {obtain | get} | succeeded in {obtaining | getting}}
    these ... {only in | not before | not until} 1941.

486 One can observe the {bright | shining} lines of the vaporous
    {sun | solar} atmosphere in the so-called reversing layer,
    {completely | wholly | entirely | very} {narrow | thin}
    vaporous {coat | veil | envelope} above the outer solar
    {border | boundary}







99487CC 1 
‘3940700 I 
99487001 



9 9.4 0 70 C l 
994 8 70 Cl 



OF HYDROGEN, HELIUM, AND MANY S METAL ES LIN OCCUR HERE , 

OF HYDWtJOEN* HELIUM, AND MANY S METAL ES LIN APRiAR HERE , 
OF THE HYDROGEN, HELIUM, AND MANY S METAL ES LIN OCCUR HERE 



9 94 870 C 1 OF THE HYDROGEN, HELIUM, AND MANY $ METAL IS LIN APPEAR HIRE , 



99487Q02 HERE HV0RDGEN » THt HELIUM, AND MANY S METAL ES LIN APPEAR 



OF HYDROGEN, HELIUM, AND MANY S METAL ES LIN OCCUR HERE 



** * - * 



OF THg HYDROGEN , HELIUM, AND MANY S METAL ES LIN OCCUR HERE 



99487C01 OF HYDROGEN, HELIUM, AND MANY S METAL^ES LIN *D IE*. OCCUR HERE- 

I ■ , 99487CC2 ‘ “ ^ ~ *' * * " JV • •- * ’ ■ " " ■ ■ ' ’ ^ r • 




' . . ' . ; 'V.-. . .vr -.-Wise ^ ■ V;:-.- - 







994PHCC1 

994880C2 



994 HBOC 1 
9 9 4 8 8 0 C 2 



994H8CG 1 
99488002 



9948000 1 
99468002 



99488C01 

99480002 



99408001 

99488C02 

99488003 



994 8 BOG 1 
99488002 



99488001 

99488002 



99488001 

99488002 

99488003 



994880C1 

99488C02 

994880C3 



99408CC1 

99488002 

99488003 



99488001 

99488002 

99488003 

’ ' '• . '• '• 

99408601 

99488002 



A RED 10 TO lb ARC S SECOND IN WIDTH E CIRCL 
SUN WREN AR LUN DISK S COVER SUN LY WHOL > • 



S APPEAR AROUND 



A RED 10 TO 15 ARC S SECOND IN WIDTH E CIRCL 
SUN WHEN AR LUN DISK ES HID SUN LY WHOL , . 



S APPEAR AROUND 



A RED 1C TO 15 ARC S SECOND IN WIDTH E CIRCL 
THE SUN WHEN AR LUN DISK S COVER SUN LY WHOL » 



S APPEAR AROUND 



A RED 10 TO 15 ARC S SECOND IN WIDTH E CIRCL 
THE SUN WHEN A.R LUN DISK ES HID SUN LY WHOL t 



S APPEAR AROUND 



A RED 1C TO 15 ARC S SECOND IN WIDTH RING S APPEAR AROUND THE 
SUN WHEN AR LUN DISK S COVER SUN LY WHUL , ** *.'* 



RED 10 TO 15 ARC S SECOND IN WIDTH S RING *6IN* S APPEAR 
AROUND THE SUN #WENN* THE AR LUN DISK S COVER SUN LY WHOL < 



RED 10 TO 15 ARC S SECOND IN WIDTH S CIRCLE *EIN* S APPEAR 
AROUND SUN WHEN AR LUN DISK S COVER SUN ELY COMPLET , ** *•* 



•RING* RED 10 TC 15 OF ARC S SECOND IN WIDTH ** *EIN* 5 APPEAR 
AROUND THE SUN WHEN THE AR LUN DISK S COVER SUN LY WHOL , ** ».» 



•RING* *,* 10 TO 15 OF ARC S SECOND E HID *ER* »ROT* •* A ** 

APPEAR *UM* *SONNE* *DIE* ** *WENN* THE AR LUN DISK S COVER 
THE SUN LY WHOL *t* ** *-« 



•RING* •**10 TO 15 OF ARC S SECOND E WlD RED ** *EIN* S 
APPEAR AROUND THE SUN WHEN AR LUN DISK S COVER SUN ELY ENTIR * ** 



10 *15* 



• RING* , * * E R * * OR E IT *v<-f 0 F,. r A RC SECOND •' 

#* •ER* . *ROT* >* >ETN* S APPEAR AROUND SUN WHEN AR LUN DISK ES 
HID . SUN 6 L Y COMPLET' ^ ^ 

•RING* •»* *ER* -BREIT* OF ARC SECOND * — • . 10 *15* 

»• «E P* 7 . •ROT* ***6lN» S APPEAR AROUND SUN WHEN AR LUN 0 1 SK ES 



•RING* *t* *ER* WID ARC S SECOND TO 10 15 ** *ER* *KOT* ** 

A «» «T* * ERSCHE IN* »UM* •SONNE* *D IE* ** *WENN* THE AR LUN DISK 
S COVER SUN ELY COMPLET » * ** *•■* 






99409001 



THIS IS CHROMOSPHERE WITH S PROMINENCE 



99489C01 THIS IS CHROMOSPHERE WITH S PROTUBERANCE . 

99409001 THIS IS CHROMOSPHERE WITH S PROTUBERANCE - 

99489001 THIS IS CHROMOSPHERE WITH S PROTUBERANCE . 

994B9G01 THIS IS CHROMOSPHERE WITH $ PROTUBERANCE . 

99489001 TpIS IS CHROMOSPHERE WITH S PROTUBERANCE , 

99489001 THIS IS CHROMOSPHERE WITH S PROTUBERANCE . 

99489001 THIS IS CHROMOSPHERE WITH S PROTUBERANCE - 

99489001 THIS IS THE CHROMOSPHERE WITH S PROMINENCE . 

99489001 THIS IS THE CHROMOSPHERE WITH S PROTUBERANCE , 

99489001 THIS IS THE CHROMOSPHERE WITH S PROTUBERANCE , 

99489001 THIS IS THE CHROMOSPHERE WITH S PROTUBERANCE . 

99489001 THIS IS THE CHROMOSPHERE WITH S PROTUBERANCE . 

.99489001, THIS IS THE CHROMOSPHERE WITH S PROTUBERANCE , 

99489001 THIS IS CHROMOSPHERE WITH S PROMINENCE *.* 

99489001 THIS IS CHROMOSPHERE WITH S PROTUBERANCE ■*.* 

99489001 THIS IS CHROMOSPHERE WITH THE S PROMINENCE „ 



99489001 

99489001 



THIS IS CHROMOSPHERE WITH THE S PROTUBERANCE « 
THIS IS CHROMOSPHERE WITH THE S PROTUBERANCE . 



99 489001 THIS IS C HROMO S P H E R EtjjW I TH THE S PROTUBERANCE . 


99409001 THIS IS CHROMOSPHERE WITH THE S PR 0 TUBER A NC E . || 







99490001 AK SOL CORONA S FOLLOW AS A SILVERY E WHIT DIM BOUNDARY 

99490002 FARTHER OUT . 

99490001 THE AR SOL CORONA S FOLLOW AS A SILVERY E WHIT DIM BOUNDARY 

99490002 FARTHER OUT , 

99490001 AR SOL CORONA S FOLLOW AS A SILVERY E WHIT DIM BOUNDARY 

9949CG0 2 FARTHER OUT *.* 



994900C 1 THE AR SOL CORONA S FOLLOW AS A' SILVERY E WHIT DIM BOUNDARY 

99490002 FARTHER OUT *.* 

99490001 AR SOL CORONA S FOLLOW AS A SILVERY E WHIT DIM BORDER FARTHER 

9949000 2 OUT ** #.* 

99490001 OF THE SUN CORONA S FOLLOW AS A SILVERY E WHIT DIM BORDER 

99490002 FARTHER OUT ** *.* 

99490001 OF SUN CORONA S FOLLOW AS A SILVERY E WHIT DIM BOUNDARY 

99490002 FARTHER OUT *. » :“1 . . I. 









-99491CC1 ARE FOUND LY CHIEF HYDROGEN, HELIUM, AND CALCIUM ES LIN, BUI 

9 9 4 9 1 G C 2 ALSO OTHER S' METAL AL SPECTR ES UN IN CHROMOSPHERE . 

9949ICC1 ARE FOUND LY MAIN HYDROGEN, HELIUM, AND CALCIUM ES LIN, BUT 

99 A 91002 ALSO OTHER S' METAL AL. SPECTR ES LIN IN CHROMOSPHERE . 

99491001 ARE FOUND LY CHIEF HYDROGEN, HELIUM, AND CALCIUM- ES LIN, BUT 

99491CC2 ALSO OTHER S* METAL AL SPECTR ES LIN IN THE CHROMOSPHERE . 

99A9J.CCI ARE FOUND LY MAIN HYDROGEN, HELIUM, AND CALCIUM ES LIN, BUT 

99491002 ALSO OTHER S» METAL AL SPECTR ES LIN IN THE CHROMOSPHERE , 



99A9IGG1 ARE FOUND LY CHIEF HYDROGEN, HELIUM, AND CALCIUM ES LIN, BUT 

99A91CC2 ALSO 0 T HER S' METAL AL SPECTR ES L I N I N CHROMOSPHERE «.< 



99A910CI ARE FOUNO LY MAIN HYDROGEN, HELIUM, AND CALCIUM ES LIN, BUT 

994910C2 ALSO OTHER S* METAL AL SPECTR ES LIN IN CHROMOSPHERE *.* 



99491 001 *MAN* S FIND LY MAIN HYDROGEN, HELIUM, AND CALCIUM ES LIN, BUT 

99491002 ALSO OTHER S' METAL AL SPECTR ES LIN IN CHROMOSPHERE * "- 5 * 

99491001 #MAN* S FIND LY CHIEF HYDROGEN# HELIUM, AND CALCIUM ES LIN, BUT 
99491002 ALSO OTHER S* METAL AL SPECTR ES LIN IN CHROMOSPHERE *****.* 

99491CC1 * M A N * . S FIND BUT ALSO ES LIN AND HYOROGEN, HELIUM S 

99491GC2 CALCIUM LY MAIN OF OTHER S METAL AL SPECTR ES LIN IN 
99491003 CHROMOSPHERE •« *.* 

99491001 *MAN** S FIND . BUT ALSO ES LIN AND *,* S' HYDROGEN S* 

99491002 HELIUM S» CALCIUM LY MAIN OTHER S' METAL AL SPECTR ES LIN IN 

99491003 CHROMOSPHERE ***** 



994 9 1C 0 l *MAN* »* *ET* >FIND* BUT ALSO ABOVE ALL HYDROGEN, HELIUM, AND 
99491CC2 -CALCIUM ES LIN AL SPECTR E LIN OTHER S METAL ** IN 
9 9491 CO 3 CHROMOSPHERE **- ** *.* 

994 9 IOC 1 >HAN* ** *ET* *F IND* BUT ALSO *N» *L INI E* AND t *-* 



994910C2 * W A S S E R ST OFF* S HELIUM S CALCIUM *HAUPTSA ECHL I CH* OTHER S' 

99491003 METAL UM SPECTR ES LIN IN THE CHROMOSPHERE **:*, * 






99492001 

99492002 

994920C3 

99492001 

99492002 

99492003 

9949200 l 
99492002 
99492003 

99492001 

99492002 

99492003 

994920C1 

99492002 

99492003 

99492001 

99492002 

99492003 

99492001 

99492002 

99492003 

99492001 

99492002 

994920C3 



WERE ED DISCOVER SEVERAL INC SHIN AL SPECTR ES LIN *t« CORUNA 
S LIGHT *M * ** *« WHOSE RELATIONSHIP TO N KNOW S ELEMENT HO 
REMAIN UN N KNOW FOR A LONG E TIM t *.* 

WERE ED DISCOVER SEVERAL I NG SHIN AL SPECTR ES LIN *1* CORUNA 
S LIGHT *M* ** • * WHOSE CONNECTION WITH N KNOW 5 ELEMENT ED 

REMAIN UN N KNOW FOR A LONG E TIM , *.* 

WERE ED DISCOVER SEVERAL I NG SHIN AL SPECTR ES LIN *1* CORONA 
S LIGHT * M# ** * * OF WHICH THE RELATIONSHIP TO N KNOW S ELEMENT 
ED REMAIN UN N KNOW FOR A LONG E TIM , *.* 

WERE ED DISCOVER SEVERAL ING SHIN AL SPECTR ES LIN *[« CORONA 
S LIGHT *M» ** ** OF WHICH THE CONNECTION WITH N KNOW S ELEMENT 
ED REMAIN UN N KNOW FOR A LONG E TIM , *«* 

WERE FOUND SEVERAL ING SHIN AL SPECTR ES LIN 
CORONA ** *M* *« ** WHOSE RELATIONSHIP TO N Kf 

REMAIN UN N KNOW FOR A LONG E TIM ** *,* *«* 

HAVE BEEN FOUND SEVERAL ING SHIN AL SPECTR Bi 
LIGHT «M* ** ** WHOSE CONNECTION WITH N KNOW f 

REMAIN UN N KNOW FOR A LONG G TIM #* *,* *,* 



LIGHT *M* ** ** OF MHII 
ED REMAIN UN N KNOW FOR 



A LONG E TIM *• *,* »,* 



HAVE BEEN FOUND SEVERAL ING SHIN AL SPECTR ES 
LIGHT *M* ** ** OF WHICH THE CONNECTION WITH N 
ED REMAIN UN N KNOW FOR A LONG E TIM ** *,-» *.* 



IN 


OUS 


LUMIN 




3W S 


ELEMENT ED 




LIN 


IN 


CORONA 


s 


ELEMENT 


ED 




LIN 


IN 


COR CN A 


s 


KNOW 


S 


CLEMENT 




LIN 


IN 


CORONA 


s 


KNOW 


S 


ELEMENT 





9 9 4 9 2 C 0 1 
99492CC2 
99492003 

99492001 

99492002 

99492003 



99492001 

99492002 

99492003 



WERE ED DISCOVER SEVERAL BRIGHT AL SPECTR ES LIN IN OF CORONA 
LIGHT **' CONNECTION WITH N KNOW S ELEMENT WHOSE ED REMAIN N 
KNOW * UN* ** A LONG E TIM * * * » * * • * 



WERE ED DISCOVER SEVERAL BRIGHT AL SPECTR ES LIN IN OF CORONA 

KNO W S ELEMENT WHOSE ED REMAIN N 

ed 

ORONA L H 



LI 

KNOW 



CORONA LIGHT ** ** 

ED REMAIN N KNOW *UN 

9 949200 1 WERE ED DISCOVER 5E 

CORONA-LIGHT,** * 



A LONG E TIM ** 

ED DISCOVER ES LIN UM SPECTR BRIGHT SEVERAL 
RELATIONSHIP TO ELEMENT N KNOW *« 

** T I M k; long •» «, » .*.« ■ 



IN OF 
WHOSE 






99492002 
99492CC 3 



L^BRIGHT AL SPECTR,;ES LIN IN THE OF 
“' IP TO^’NvKNOW, S ELEMENT ** WHOSE ** 



*BLIEB* N KNOW *UN* **■ FOR A LONG E TIM ** *,* .*.* 






994940C 1 
99494 ZQ2 



994940" 1 
9 9 4 9 4 0 C 2 



994940C1 

99494002 



994940C 1 
y 9 4 9 4 £ 0 2 



994940CI 

99494002 



99494001 

99494C02 



99494001 

994940C2 

99494QC3 



9 94 94 00 I 
994940C2 
99494CC3 



99494001 
99494002 
9 9 4940 C 3 






0, EOLEN DID MAN AG 10 GOT THESE AL SPFCTR ES LIN IN SUITABLE 
IAL T EH RES f R OUS LUMIN S SOURCE IN UPSALA NOT UNTIL 1941 . 



B- EDL6N DID MANAG ID GET THESE AL SPECTR ES LIN IN SUI TABLE 
IAL. TERRESTR OUS LUMIN S SOURCE IN UPSALA NOT BEFORE L941 « 



B, EOLEN DID MAM AG TC OBTAIN THESE AL SPECTR ES LIN IN SUITABLE 
IAL TERRESTR OUS LUMIN S SOURCE IN UPSALA NOT UNTIL 1941 t 



B. EDLEN DID MANAG TO OBTAIN THESE AL SPECTR ES LIN IN SUITABLE 
IAL TERRESTR OUS LUMIN S SOURCE IN UPSALA NOT BEFORE 1941 , 



B* EDLEN DID MANAG TO GET THESE AL SPECTR ES LIN IN SUITABLE 
IAL TERRESTR OUS LUMIN S SOURCE IN UPSALA ONLY IN 1941 , 



B. EDLEN DID MANAG TO GET THESE AL SPECTR ES LIN IN SUITABLE 
IAL TERRESTR OUS LUMIN 3 SOURCE IN UPSALA ONLY IN 1941 . 



B, EOLEN DID SUCCEED IN TING GET THESE AL SPECTR ES I IN in 
SUITABLE IAL TERRESTR OUS LUMIN S SOURCE IN UPSALA NOT UNTIL 1941 



B. EDLEN DID SUCCEED IN TING GET THESE AL SPECTR FS .in in 
l-941 A 'i- Le IAL TERRESTR OUS LUMIN S SOURCE IN UPSALA NOT BEFORE 



ED SUCCEED IN 1 NG OBTAIN ES LIN 
SOURCE LIGHT EARTH SUITABLE ** m* 
ONLY IN 1941 ** *.*=?-'••.•••• 



UM SPECTR.: OETSERV IN - 
EDLEN IN UPSALA ** 








PATHS ltl.. 1*2. .2*1. .2*2. .3*1. .3*2# 



12i 1. . 12» 2 



99406G01 
994B60G2 
994 8 60 C 3 
994 8 6CC4 
9 94 8 6C 05 
994B6CC6 
9 -J486CC7 



•MAN * ** CAM E OBSERV ES UN AR SDL S ATMOSPHERE «EN* DUS 
VAPOR «• *DER« *» INC SHIN •DIE* IN LAYER * * « OUS VAPOR S 
ENVELOPE MEYONO THE CUTER AR SOL BOUNDARY »* f PHOTOSPHERE • 
THIN A ING REVERS’ SO-CALLED *D6R* ** FOR A FEW S MOMENT 
WHEN I NO ADVANC MOON L S LCAV F: VISIBL A VERY THIN AR SOL SURFACE 
EDGE CN ONE E SID JUST LY HARE DURING A AR SOL DARKNESS , 
SU-CALLEC FLASH UM SPECTR* , ** *,* 



99406CC 1 
99486CC2 
99486003 
99486 CCA 
9 9 4 B 6 0 C 5 
99486CC6 
994B6CC7 



• MAN* *« CAN E OBSERV ES LIN AR SOL S ATMOSPHERE *GN* OUS 
VAPOR ** »DER* »* ING SHIN *DIE* IN S» LAYER *** OUS VAPOR S 
ENVELOPE ABOVE THE OUTER Aft SOL BOUNDARY ** , PHOTOSPHEKt 

NARROW AN ING REVERS SO-CALLED *DER* ** FUR A FEW S MOMENT 
WHEN ING ADVANC MUON ES LEAV E VISIBL A VERY THIN AR SOI. SURFACE 
EDGE CN ONE E SIP JUST LY BARE DURING A AR SOL DARKNESS , 
SO-CALLED FLASH UM SPECTR * *» *.* 



•MAN# #* CAN E OBSERV ES LIN AR SOL S ATMOSPHERE 499# OUS 
VAPOR ** *DER* ** BRIGHT »DIE# IN LAYER * . * OUS VAPOR S 

.. ENVELOPE BEYOND THE OUTER AR SOL BOUNDARY ** , PHOTOSPHERE , 

99486004 THIN A ING REVERS SO-CALLED *DER* ** FUR A FEW S MOMENT 

994B6Q05 •WENN* ING ADVANC MOON ES LEAV £ VISIBL A VERY THIN OF AR SOL 

99486006 SURFACE EDGE ON* SID ONE JUST LY BARE DURING A AR SOL 

99486007 DARKNESS » SO-CALLED OF FLASH UM SPECTR* ** *»* ** *,* 



9948600 l 
99486002 
994B6C03 



99486001 • MAN * *# IS *EN* OBSERV *N* *LINIE* ATMOSPHERE *N* SOL ** 

99486002 OUS VAPOR *• *DER* • ** *£N * SHIN •* *0 1 E * * I N * S LAYER * 

99486003 ENVELOPE VAPOR ** BEYOND OUTER AR SOL BOUNDARY *DER* ** * 

99486004 PHOTOSPHERE *, * THIN *E INER* *EN* *END* REVERS *EN* SO-CALLED 

99486005 ' ».** «DEU« -*• FOR A FEW S* MOMENT ** *WENN* INGAOVANC MOON 

99486006 *DER * ES LEAV E VISIBL VERY THIN OF AR SOL SURFACE EDGE 
99486007 *£ I NEN* ON *SE t IE* ONE JUST LY BARE ** DURING A AR SOL 

99486008 DARKNESS •* , THE SO-CALLED OF FLASH UM S PEC f R ,*##,#« # *«* 



99486001 

99486002 

99486003 

99486CC4 

994H6GC5 

99486006 

99486CC7 

99486008 

99486009 



• MAN# ** *KANN« *EN* *BEOBACH.T* *N* * L.I NIB* SUN S ATMOSPHERE 
■# 6 N *1;* D A MPFFOERMIG * . ft D E R # ** * E N * * HELL*2»* I* DIE## I N 
»SCHICHT» * > * OUS VAPOR VEIL «OSERHALB» OUTER AR SOL BORDER 
*UER* ** *EN* •SCHMAL* ** *6 INER* *EN» 

SO-CALLED *»*DER* **f^ FUR A FEW S MOMENT ** *WENN> LUN #E* 

• END* ADVANCVV# #OER* ES LEAV E V ISIBL A ELY ENTIR NARROW 0F::AR 

> »SF I TF» *PM » ONE ** *DER* ** JUST 
LY BARE DURING AN OF SUN. ECLIPSE * (i SO-CALLED' OF FLASH UH , 
SPECTR #1* «***•*«*.* 



99486001 

99486002 

99406063 

994&6aam 



► N* 



* s — 



*MAN» * * - * K A N N * *EN*; •BEOBACHT* *N* *L INI E* *ATMQSPHAER£* 

ON N E * * * *DAMPFFOERM I G» ft. * * D ER * * * * E N * * HE L L * ft » 

*D 1 E * * IN * H I CH T* *,* *H UE LLE* *DAM PF»***OB E R HALB* AR S 0 L 



99486C05 



BOUNDARY >en;* #AEUSSER;K 3*DER* #* 



99486006 

99486007 

99486008 

99486009 

99486010 



* > E N ft *5 C H M A L *i • * - *E INER** EN* . *END* . * EN ft * S 0 G E N A N N T * , * * 

*DER* ** *FUER* *E* .MOMENT * A. FEW ** * WENN* *MOND* *E* 

•END* *SCHREIT* *F0rV ( * ** *DER* -'** *T* *LAESS* *FRE I* OF AR SOL 







PATHS 1 *0 » U » 1-2, 0,0, l— -12,0,0,1-1,0 ,0,2- 2,0, 



0 , 2 - 



■12 , 0 t J p 2 



9 '5406001 
994 8 A0C2 
99486003 
99486004 
99406005 
99486006 
99486007 



♦ MAN 1 * ■♦* IS E OW5ERV ES L I N ATMOSPHERE SUN OUS VAPOR 

»# *OEK* ** OKIGHT ♦DIE* IN LAYER , ENVELOPE^*injS VAPOR 
BEYCNC BORDER SUN OUTER *DER* "»♦ i PHOTOSPHERE V THIN A ING 
REVERS SO-CALLED *OEK* ** FOR A FEW S MOMENT *WENN* THE ING 
ADVANC MOON E S LEAV E VISIBL EDGE SURFACE SUN *0ER* ** TH IN 
COUPLET ** A ON* SID ONE JUST LY BARE ♦* DURING ECLIPSE 
SUN A ** ", THE TO-CALLED OF FLASH UM SPECTR, *♦ ♦,« •• *.* 



99486001 
994860C2 
99486003 
99486004 
99486005 
99486006 
9948600 7 . 
99486008 



• MAN* ** * KAN N * , • E N ♦ OBSERV *N» L IN ^ATMOSPHERE *N* SUN ** 
VAPOR ♦*#DER* ** BRIGHT *DIE* IN LAYER ( COA T VAPOR 

ABOVE BO UN OAR Y T H E SUN ; QUTER *DER* ** , THE P HOTOS PHERE. 

NARROW AN ING REVERSE SO-CALLED *DiR*** FOR A FEW S MOMENT 
*WENN* ING ADVANC MOON S ESfLEA^^ 

•UER. .. NARROW . t.NrlR .. AN ON > SID ONE JUST LY BARE .. 

DARKNESS THE SUN AN ** , THE SO-CALLED OF THE FLASH UM 






• MAN *,** ^*K ANN*,-,* bN* *BE0BACHT* *N*fe#LlNIE* * ATMOS PH A E KF * 

H [ * * | * OUS VAPOR COA T *UBERtfALO* AR SCT S "" 

■)9<,B6=CA BUKOER *ENi -AEUSSER. .DER. .. . , . PHOTOSWHCRE . , i St ‘ l S 

- -AN. .SCHWAL, .. .C INEK. ,EN. .SOGENANNT. "er" I. .,-UER. 

*t* * G L I CK * *N* * AUGE * **-N,£* * WEN I G* . *E* , *E I N I G * ♦» •• 

- *Wf-NN» ■ T HP I NG'iAU VANC 4 ' ' 



994H6GG7 
99406GC8-: SURFACE 

• 'J 9 4 8 6 j 0 9 : T Hf : 







99490001 

9949CGC2 


AR SCI. CORONA S 
FARTHER OUT . 


99490GC1 
9 94900 C 2 


AR SOL CORONA 3 

our. 


99490001 

99490002 


AR SUL CORONA S 
OUT . 


99490001 
994 90 GC 2 


AR SOL CORONA S 
CUT . 


9949000 1 
99490002 


SOLAR CORONA S 
OUT . 


994900C 1 
9949CGC2 


SOLAR CORUNA S 
OUT . 


99A 9CQG 1 
994 900 C2 


AR SOL CORONA S 
FARTHER OUT *. * 


99490001 
9 94 900 02 


AR SCL CORONA S 

gut **;* 


99490001 
994900 02 • 

99490601 

99496002 


AR SOL CORONA S 
'OUTv. * 


OUT * • * ■ "Xv-. 



AR SCL CORONA S FOLLOW AS A SILVERY WHITE DIM BO UNO A R Y F A R I H £ R 

T # *# -v f ‘ > /•; • . .• j- 

AR SOL CORONA S FOLLOW AS A SILVERY E WHIT DIM BOUNDARY FAR I HER 



m,T° LAR C0R0NA S FOLLOW AS A SILVERY £ WHIT DIM BOUNDARY FAR I H£R 
90002 - - OUT ■* •* * •, , . 

90001 SOLAR CORONA' S FOLLOW AS A SILVERY WHITE DIM - BOUNDARY . FAhTht-h 



9949G0C2 










99492002 

99492003 



LIGHTS »M* #*'** WHOSE RELATIONSHIP TO N KNOW S ELEMENT ED 

REMAIN UN N KNOW FOR A LONG E TIM • *.* 



99492001 
99492002 
9 9 4 9 2 v C 3 



SEVERAL ING SHIN AL SPECTR LINES WERE 
LIGHTS » M * ** * » WHOSE RELATIONSHIP TO 
REMAIN UN N KNOW FOR A LONG TIME t •** 



ED DISCOVER *1* 

N KNOW S ELEMENT 



CORONA 

ED 



994920CI 
994 92 0 0 2 
994920 03 



SEVERAL ING SHIN AL SPECTR LINES 
LIGHTS *M* ** »* WHOSE CONNECTION 
UN N KNOW FOR A LCNG E TIM f *.* 



WERE 
W I TH 



ED DISCOVER * 1 * 
N KNOW ELEMENTS 



CORONA 
ED REMAIN 



9 9 4 9 ? C 0 I 
99 4 9200 2 
99492003 



WERE EL DISCOVER SEVERAL ING SHIN SPECTRAL ES LIN *1* 
LIGHTS *M# «* •* WHOSE CONNECTION WITH N KNOW ELEMENTS 
UN N KNOW FOR A LONG TIME »*.» 



CORONA 
GO REMAIN 



99492001 

994920C2 

99492003 



SEVERAL ING SHIN SPECTRAL ES LIN WERE FOUND IN OUS LUMIN 
CORONA ** «M* ** ** WHOSE RELATIONSHIP TO N KNOW ELEMENTS ID 
REMAIN UN N KNOW FOR A LONG E TIM «» »,«*.* 



99492CC 1 
99492002 
99492003 



several: 

L IGHT *M* 
REMAIN UN 



ING SHIN SPECTRAL ES LIN HAVE BEEN FOUND IN 
»*** WHOSE RELATIONSHIP TO N KNOW S ELEMENT 
N KNOW FOR A LONG TIME ** *** *.* 



CORONA 

ED 



99492001 
99492002 
994 9200 3 



SEVERAL ING SHIN SPECTRAL LINES WERE FOUND IN 
«M» «• *« WHOSE CONNECTION WITH N KNOW ELEMENTS 
KNOW FOR A LONG TIME *» *,*».»• 



CORONA LIGHTS 
ED REMAIN UN N 



99492001 

99492002 

99492GG3 



SEVERAL ING SHIN SPECTRAL LINES HAVE BEEN FOUND IN CORONA 
LIGHTS * M * •#** WHOSE CONNECTION WITH N KNOW S ELEMENT ED 

REMAIN UN N KNOW FOR A LONG TIME ** *.**-* 



99492CC 1 
994920C2 
99492003 



SEVERAL 
## ;■** 
REMAIN UN 



ING SHIN AL,:SPEGTR ES LIN WERE FOUNO 
OF WHICH THE RELAtlONSHIP TO N KNOW 

*.* 



IN CORONA LIGHTS 
ELEMENTS ED 



99 4 9 20 Cl 
994 9 2 GC 2 
99 4 9 20 03 



99492C0 1 
994 
9949200 



N KNOW FOR A LONG TIME **** 

SEVERAL ING SHIN AL SPECTR ES LIN, HAVE BEEN FOUND, . IN 
LIGHT S *M* **':*> OF WH I CH THEf REL AT I ON SH I P TO N KNOWS 

ED REMAI N IJN NKNOW-FO R--;‘A^C NG.— T-I ME -* *- -• »-*•;-*. * ‘ 

• - ’ . 

W FRF ED DISCOVER I NG SHIN SPECTRAL ES LIN *E# SEVERAL 



CORONA 
ELf MEN T 



Si 




99492 
99492002 
99492003 



REMAIN« 5 UN> N KNOW :S A i: LONG T IME,** *»* *.* 








994 94 DC 1 
99494002 



h. EDLEN DID MANAG TO GET THESE SPECTRAL ES LIN IN SUITABLE 
TERRESTRIAL (KJS LLP IN S SOURCE I N . UP5ALA . NOT UN I I L 1941 , 



994940C l 
994940C2 



IU E D L E N DID MANAG TO GET THESE SPECTRAL GS LIN IN SUITABLE 
TERRESTRIAL OUS LUMIN S SOURCE IN UPSALA NOT BEFORE 1941 - 



99494001 



D. EDLEN DIO MANAG TO GET THESE SPECTRAL ES LIN IN SUITABLE 



Fig, 5-3 



99494002 TERRESTRIAL OUS LUMIN SOURCES IN UPSALA NOT UNTIL 1941 



99494QC 1 
99494002 



D. EDLEN DID MANAG TO GET THESE SPECTRAL ES LIN IN SUITABLE 
TERRESTRIAL OUS LUMIN SOURCES IN UPSALA NOT BEFORE 1941 . -* 



99494001 

99494002 



0- EDLEN DID MANAG TO GET THESE SPECTRAL ES LIN IN SUITABLE TAL 
TERRESTR OUS LUMIN S SOURCE IN UPSALA NOT UNTIL 1941 . 



99494CC1 

99494002 



B, EDLEN DID MANAG TO GET THESE SPECTRAL ES LIN IN SUITABLE TAL 
TERRS ST R OUS LUMIN S SOURCE IN UPSALA NOT BEFORE 1941 V ' 



9 94 9 4 DC 1 
99494C02 
99494003 



B. EDLEN DID SUCCEED IN TING GET THESE SPECTRAL ES’LIN IN 
SUITABLE TERRESTRIAL OUS LUMIN S SOURCE IN UPSALA NOT UNTIL 1941 



994940C 1 
99494002 
994940C3 



B. EDLEN DID SUCCEED IN TING GET THESE SPECTRAL ES LIN IN 
SUITABLE TERRESTRIAL OUS LUMIN S' SOURCE IN UPSALA NOT BEFORE 194 1 



99494001 

99494002 



B, EDLEN DID’: SUCCEED IN TING GET THESE SPECTRAL ES LIN IN 
SUITABLE TERRESTRIAL OUS LUMIN SOURCES IN UPSALA^ NOT UNTIL 1941 



9 94 9400 1 
994 94 C C 2 
99494003 



l B ^ EDLEN :DID^ SUCCEED IN; ; TING -GET rTMSE • SPE^ 

SUITABLE TERRESTRIAL OUS LUMIN SOURCES IN UPSALA NOT BEFORE 1941 











Figure 7-1 



486 



BEOBACHTEN:

1. /  SBJ +HU;  OBJ +ACC
      observe, watch
      Ex: Mark beobachtete Sylvia = Mark watched Sylvia.

2.    SBJ +HU;  OBJ +ACC +AB;  an OBJ +DAT +HU
      observe in sb., notice in sb.
      Ex: Mark beobachtete Zeichen von Triumph an Sylvia =
          Mark noticed signs of triumph in Sylvia.

3.    SBJ +HU;  ADV +MAN
      observe
      Ex: Mark beobachtet gut = Mark observes well.

4. /  SBJ +HU;  OBJ +ACC +AB
      follow, obey, observe, respect, comply with
      Ex: Die Roemer [...] das Gesetz = The Romans
          observed the laws.
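The valence frames of an entry such as BEOBACHTEN lend themselves to a simple data representation. The following Python sketch is only an assumption about how such an entry could be encoded; it is not the format used by the Linguistics Research System. The names `Slot`, `Frame`, `slot_fits` and `matching_frames` are hypothetical, and the feature labels (+HU, +ACC, +AB, +DAT, +MAN) are copied from the figure as printed. A frame is selected when each of its slots is satisfied by a different constituent of the analyzed clause.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Slot:
    role: str              # grammatical role, e.g. "SBJ", "OBJ", "ADV"
    features: frozenset    # features the filler must carry, e.g. {"+HU"}
    prep: str = ""         # governing preposition, empty if none


@dataclass(frozen=True)
class Frame:
    number: int            # reading number as given in the figure
    slots: tuple           # the Slot objects of this reading
    glosses: tuple         # English equivalents


def fs(*names):
    """Shorthand for a frozen feature set."""
    return frozenset(names)


# The four BEOBACHTEN readings, transcribed from the figure.
BEOBACHTEN = (
    Frame(1, (Slot("SBJ", fs("+HU")), Slot("OBJ", fs("+ACC"))),
          ("observe", "watch")),
    Frame(2, (Slot("SBJ", fs("+HU")), Slot("OBJ", fs("+ACC", "+AB")),
              Slot("OBJ", fs("+DAT", "+HU"), prep="an")),
          ("observe in sb.", "notice in sb.")),
    Frame(3, (Slot("SBJ", fs("+HU")), Slot("ADV", fs("+MAN"))),
          ("observe",)),
    Frame(4, (Slot("SBJ", fs("+HU")), Slot("OBJ", fs("+ACC", "+AB"))),
          ("follow", "obey", "observe", "respect", "comply with")),
)


def slot_fits(slot, constituent):
    """A constituent is a (role, preposition, feature set) triple."""
    role, prep, features = constituent
    return (slot.role == role and slot.prep == prep
            and slot.features.issubset(features))


def matching_frames(frames, constituents):
    """Return the frames whose slots can be paired one-to-one with the constituents."""
    result = []
    for frame in frames:
        if len(frame.slots) != len(constituents):
            continue
        # Try every assignment of constituents to slots (entries are tiny).
        if any(all(slot_fits(s, c) for s, c in zip(frame.slots, perm))
               for perm in permutations(constituents)):
            result.append(frame)
    return result
```

With the clause "Mark beobachtete Sylvia" encoded as `[("SBJ", "", {"+HU"}), ("OBJ", "", {"+ACC", "+HU"})]`, only reading 1 is selected. For an abstract object such as "das Gesetz" both reading 1 and reading 4 fit, because reading 1 places no restriction on the object beyond its case; a working system would need some further preference among such candidates.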



FREILASSEN:

1. /  SBJ +HU;  OBJ +ACC +HU
      free, set free, liberate
      Ex: Mark liess Sylvia frei = Mark set Sylvia free.

2. /  SBJ +HU;  OBJ +ACC -HU
      leave blank, leave open, leave [...], leave visible [20]
      Ex: Mark liess eine Zeile frei = Mark left a line [...]




Figure 7-2 



2.    [...]
      Ex: Mark trat leise auf = Mark trod softly.

3.    SBJ +HU;  gegen OBJ +ACC
      come up against, rise against, oppose
      Ex: Die Griechen traten gegen die Tuerken auf =
          The Greeks rose against the Turks.

4.    SBJ +HU;  fuer OBJ +ACC
      stand up for
      Ex: Mark trat fuer Sylvia auf = Mark stood up for [...]

5.    SBJ +HU;  vor OBJ +DAT +HU
      perform before
      Ex: Mark trat vor dem Koenig auf = Mark performed
          before the king.

6.    SBJ +HU;  als CMPL +NOM
      [...] as, pose as
      Ex: Mark trat als Koenig auf = Mark posed as a king.

7.    SBJ +AN;  wie CMPL +NOM
      behave like, act like
      Ex: Mark trat auf wie ein Fuerst = Mark behaved like
          a duke.

8. /  SBJ +AB
      occur, [...], arise, result, ensue, appear
      Ex: Ein Fall von Cholera war aufgetreten = A case
          of cholera had occurred.

appear, perform, enter










490 



Figure 7-3 



      SBJ +HU;  OBJ +DAT +HU
      appear to sb.
      Ex: Der Geist war Mark erschienen = The ghost
          had appeared to Mark.

      SBJ;  OBJ +DAT +HU;  ADJ
      seem, appear, look
      Ex: Die Loesung erschien Mark gut = The solution
          looked good to Mark.

      ADV +MEAS
      wide, in width
      Ex: drei Meter breit = three meters wide

      N +PO
      broad, wide, spacious, large, vast
      Ex: ein breites Gesicht = a broad face

      N +AB
      extensive
      Ex: eine breite Darstellung = an extensive description

ANSCHLIESSEN:

1.    SBJ;  OBJ +ACC -AB
      chain, connect, fasten with a lock
      Ex: Mark schloss das Fahrrad an = Mark fastened the
          bike with a lock.

2.    SBJ +HU;  [...]





Figure 7-4 

      Ex: Mark schloss das Fahrrad an den Zaun an =
          Mark chained the bike to the fence.

4.    SBJ;  OBJ +ACC +AB;  an OBJ +ACC +AB
      add to
      Ex: Mark schloss die folgende Bemerkung an seine
          Rede an = Mark added the following remark to his
          speech.

5.    SBJ +AN;  OBJ +REFL +ACC;  OBJ +DAT +HU
      accompany, join
      Ex: Mark schloss sich Sylvia an = Mark joined Sylvia.

6.    SBJ +AN;  OBJ +REFL +ACC;  an OBJ +ACC +HU
      join
      Ex: Mark schloss sich an Sylvia an = Mark joined [...]

7.    SBJ -AN;  OBJ +REFL +ACC;  an OBJ +ACC -HU
      be adjacent to, border on
      Ex: An Texas schliesst sich Oklahoma an = Oklahoma
          borders on Texas.

8. /  SBJ
      follow
      Ex: Weiter aussen schliesst die Sonnenkorona an =
          The corona of the sun follows further out. [20]

491




Figure 7-5 



492  ZUORDNUNG:

1.    zu OBJ +HU v +AB +DAT
      [...] relationship to, connection with

2.    N +AB
      coordination

ZEIT:

1.    N +TIM
      time

494  GELINGEN:

1. /  SBJ +AB;  OBJ +DAT +HU
      succeed in
      Ex: Das Experiment gelang Mark = Mark succeeded
          in the experiment.

2.    SBJ +AB
      be successful, succeed, work
      Ex: Das Experiment gelang = The experiment was
          successful.

3. /  es;  zu INF;  OBJ +DAT
      succeed in + Gerund, manage to + Inf
      Ex: Es gelang Mark, das Experiment durchzufuehren =
          Mark succeeded in performing the experiment.

ERHALTEN:









Figure 7-6 



3.    SBJ +HU;  OBJ +ACC +HU;  von OBJ +DAT +AB
      maintain sb. on, support sb. on
      Ex: Mark erhielt seine Eltern von seinem mageren
          Gehalt = Mark supported his parents on his small
          salary.

4.    SBJ +HU;  OBJ +ACC +REFL;  von OBJ +DAT +AB
      subsist on, support oneself on
      Ex: Mark erhielt sich von Almosen = Mark subsisted
          on alms.






REQUIREMENTS FOR MACHINE TRANSLATION: 
PROBLEMS, SOLUTIONS, PROSPECTS 







Rolf Stachowitz









TABLE OF CONTENTS

1   Introduction
2   Comprehension and Translation
3   Desirable Properties of a Translation Device
4   The Capabilities of Current Competence Models or
    The Properties of a Realizable Mechanical
    Translation Device
5   The Linguistics Research System
6   Progress in Hardware Development and the Future
    of Machine Translation

Footnotes
Lexicographic Work at the Linguistics Research Center
Bibliography


















1. Introduction



Today it is generally accepted that the expression "science" 
no longer refers to a discipline which deals with a particular 
subject area but in general to any discipline which uses a 
particular method of research: the so-called "scientific 
method". We classify various disciplines according to whether 
they make use of the scientific method or not. Thus, we
exclude disciplines like history or literary analysis from
the sciences.[1]



We shall only deal with two of the criteria which constitute
the scientific method: intersubjectivity and verifiability.
Intersubjectivity means that the result obtained by one
person starting from certain assumptions and working
according to a particular method should be obtainable by
other persons operating with the same assumptions and the
same method. By verifiability we mean that the statements on
certain phenomena in a particular research area have to be
verifiable. The "principle of tolerance" [... empiric ...]
(Toleranzprinzip), formulated originally but later abandoned
by Carnap, no longer holds in the sciences. Introspective,
[...]





The development of linguistic theory and advances in computer
hardware and software have put linguistic science into the
fortunate position of being able to verify by computer the
various hypotheses and theories made about linguistic
phenomena because of a correspondence between formal languages
and programming languages: everything that can be formalized
can be programmed and vice versa. A number of computational
linguists have consequently written programs which process
transformational grammars, so-called grammar testers, and
have made them available to the linguistic community. The
linguistic community has as yet made little use of such
programs. The few linguists who have had their grammars
processed by such an algorithm soon found out that their
hypotheses were falsified.
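How a mechanical check can falsify a grammatical hypothesis may be illustrated with a deliberately small sketch. It is not one of the grammar testers mentioned above: it handles only a toy phrase-structure grammar without transformations and without left recursion, and the rule set `GRAMMAR` and the functions `parse` and `accepts` are hypothetical names chosen for this illustration.

```python
# A toy phrase-structure grammar: each nonterminal maps to its alternative
# right-hand sides. Symbols that are not keys are treated as terminals.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Mark"], ["Sylvia"]],
    "VP": [["V", "NP"], ["V"]],
    "V": [["watched"], ["observes"]],
}


def parse(symbol, tokens, pos):
    """Yield every position at which an expansion of symbol starting at pos can end."""
    if symbol not in GRAMMAR:
        # Terminal: it must match the token at the current position.
        if tokens[pos:pos + 1] == [symbol]:
            yield pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        ends = {pos}
        for part in rhs:
            # Advance every partial analysis through the next symbol.
            ends = {e2 for e in ends for e2 in parse(part, tokens, e)}
        yield from ends


def accepts(sentence):
    """True if the grammar derives the whole sentence from S."""
    tokens = sentence.split()
    return len(tokens) in set(parse("S", tokens, 0))
```

Running `accepts` on a list of test sentences shows at once what the rules actually predict. The toy grammar accepts "Mark watched Sylvia" and rejects "watched Mark Sylvia", but it also accepts "Mark watched", which the author of the rules may not have intended; it is in this sense that a hypothesis can be falsified by machine.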



The reluctance of linguists to use a computer is, of course,
based on the fact that there is no comprehensive theory of
grammar that works. Estimates on the length of time required
to construct such a grammar vary considerably. We have heard
opinions indicating a time of about 500 years. Though we are
[...] may take about 150 years of grammatical research to come
up with a comprehensive grammar for a language.






[...] theories? He can resign himself to the view that language
is a phenomenon which cannot be treated algorithmically,
at least not from a recognition point of view, which is
true for formulas of the predicate calculus. We personally
are disinclined to accept such a resignation, since we know
that everybody can speak but not everybody can prove logical
theorems.

The second possibility is to assume that grammars are indeed
highly complicated and that we must work patiently, hoping
that future generations will be able to make use of our
preliminary work.



The third possible course of action, the one we are going to
follow, is to investigate whether all the scientific and
methodological premises of current grammar theory, especially
its descriptive and explanatory apparatuses, are really
necessary, or whether they can be replaced by a simpler
system of apparatuses under preservation of the observational,
descriptive, and explanatory adequacy. We shall thus treat
[...]








[...] correspond to a competence model, as grammar models are
normally called, and to its various components, the deep
phrase-structure component, the transformational component,
and the semantic component? (For present purposes, we shall
ignore the phonological component.) Which are the phenomena
explained by such a model, which remain unexplained?

To accept the stipulation of transformational grammarians
that competence models not be regarded as performance models
imposes a heavy burden on our research, but instead of
discussing whether such a request is legitimate, we decide that
we can still investigate such models and their components
as part of a hypothesized performance model.



It is very difficult to believe the claim that a grammar
of a language with a finite set of terminal symbols is an
adequate representation of a phenomenon that occurs almost
any day: the introduction of new words in a language, which
either name new objects or which are introduced by means of
definitions. A grammar model as it is normally defined is
basically static, something that, I believe, Humboldt would
have called, not an energeia but an ergon, incapable of
[...]





[...] the specific mathematical achievement of the development
of a general procedure - renders valueless, as it were,
the area covered by this procedure.")

Which possibilities for verification do we have for a
competence model?



a) We could check its output. Apart from the fact that
this output does not exist yet, this criterion, if used
alone, could equally be used to present, as a model of the
human capability to divide and multiply, a computer program
which performs division and multiplication by iterative
subtraction and iterative addition.
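The kind of program meant under (a) can be sketched in a few lines. The function names `multiply` and `divide` are illustrative only, and the restriction to non-negative integers is an assumption made to keep the sketch short. Judged by output alone, the two functions are indistinguishable from any other correct arithmetic, which is exactly why output cannot be the sole criterion for a model of the human capability.

```python
def multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers by iterative addition."""
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative integers only")
    total = 0
    for _ in range(b):
        total += a  # add a to the running total, b times
    return total


def divide(a: int, b: int) -> tuple[int, int]:
    """Divide a by b by iterative subtraction; return (quotient, remainder)."""
    if a < 0 or b <= 0:
        raise ValueError("this sketch needs a >= 0 and b > 0")
    quotient = 0
    while a >= b:
        a -= b  # take b away once more
        quotient += 1
    return quotient, a
```

For example, `multiply(6, 7)` performs seven additions and `divide(17, 5)` performs three subtractions before returning the quotient 3 with remainder 2. The results are correct, yet nobody would claim that people multiply or divide in this way.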



b) We could consider the structural description which is
assigned to surface sentences. We grant that the structural
description which a competence model assigns to a surface
sentence corresponds to our linguistic intuition. However,
we see no means to decide that such surface structures are
derived transformationally from deep structures; they might
equally well be derived from a surface phrase-structure
component. Recent development in standard transformational
grammar which makes the deep structure representation
correspond more and more to the surface representation actually
[...]





[...] categories, as subject of a sentence or predicate of a
sentence, has already been shown by various transformational
grammarians not to be applicable for such semantic
categories as objects or adverbials in the case of verbs with
multiple objects and multiple adverbials. This claim, I
believe, was shaken by Charles Fillmore,[4] who pointed out
that the deep representation is not really a representation
of semantic relations between constituents. This has been
admitted by Chomsky if I understand his comments in "Deep
Structure, Surface Structure and Semantic Interpretation"
correctly. Others pointed out that important linguistic
concepts such as "head" of a phrase cannot be expressed by
means of the deep phrase-structure component.[5]

Which reality corresponds to the transformational component? 



We do not doubt that transformational relations exist
between surface structures. But, as far as I know, there
is no empirical verification for the existence of ordered
transformations. The few examples, all based on
reflexivization, can be explained in a different way.

Which observable phenomenon corresponds to an intermediate
phrase marker? No real investigation has been performed
on this aspect. The reality of intermediate phrase markers






as "Give Harry the book written by John". We know that 
the string, by means of preposition deletion, eventually 
results in "John gives Harry the book". 

Which experienceable reality corresponds to a semantic 
component, which cannot explain the process of introducing 
a new word by definition, the modification of meaning by 
explication, which cannot represent in a sentence reading 
the synonymity or the occasional intersection of the 
semantic readings of two words expressed by the "explicative
or" (corresponding to the stylistic term "hendiadyoin")
when no individual term in a language represents that
semantic



The rigor which had been introduced into linguistics by 
means of the notion of rules and transformation rules 
in the earlier version of transformational grammar has 
gradually disappeared. We are not able to relate the
surface phenomena that we can observe to the semantic
representation or the deep structure since the increased
complexity of the transformational apparatus makes the
establishment of such relations and their verification
extremely difficult if not impossible. The "remedies"
which have been proposed: to make the deep structure
more and more similar to the surface structure or more




Peters and Ritchie. 8



In a science we set out to describe the facts that we 
observe and to try to relate them, to find an explanation 
for them, a system, a structure. The principles that in 
general are used in setting up the observational and ex- 
planatory apparatus are that they should be adequate and 
appropriate. These principles are also influenced by 
certain esthetic considerations: that the apparatus should
be as simple as possible. From our point of view, this
means: We now know a lot more about linguistic theory than
we did twenty years ago. We know that language is the
language of man, whose capabilities we should not exclude
when dealing with language. We should begin research
again by relating surface sentences to surface sentences
by means of transformations, but by means of transformations
which are kept as simple as possible, which relate surface









since the person who started it all, Zellig S. Harris, has
been describing such a model for some time. 9 Our own
model, which we are going to describe in Chapter 5 of this
paper, is based on the notion of Harris' substitution
transformations. It has been constructed with the aim to



explain certain human capabilities, among them the acquisition
of new words and their definition by means of the
context. Our grammar is based on the assumption that
sentences can be represented as connections of elementary
predications. Thus the sentence "A young girl sang a
song" is representable as the sequence of connected
predications:

girl(x1) ^ young(x1) ^ song(x2) ^ sing(x1, x2)
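As a rough illustration of sentences as connected elementary predications, the following sketch stores the example as a list of predicate/argument pairs. The `Predication` type and the `render` helper are assumptions introduced here; the report does not prescribe a data structure.

```python
from typing import NamedTuple, Tuple


class Predication(NamedTuple):
    """One elementary predication: a predicate applied to object variables."""
    predicate: str
    args: Tuple[str, ...]


def render(predications) -> str:
    """Write a list of predications as a caret-joined conjunction."""
    return " ^ ".join(
        f"{p.predicate}({', '.join(p.args)})" for p in predications
    )


# "A young girl sang a song" (tense is ignored in this sketch)
YOUNG_GIRL_SANG = [
    Predication("girl", ("x1",)),
    Predication("young", ("x1",)),
    Predication("song", ("x2",)),
    Predication("sing", ("x1", "x2")),
]
```

Sharing the variable x1 between `girl`, `young` and `sing` is what "connects" the predications.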

Sentences are not generated by rewriting the initial
symbol S but by reducing them to symbol S both during
recognition and production. The model is a representation
of recognition in that it derives meanings from surface
sentences; a model of production, in that it derives surface
sentences from a representation of their meanings.

When I proposed, after my experiment in paraphrasing
and translation, 10



Winfred F. Lehmann, Rowena Swanson, and Zbigniew L. Pankowicz.

In order to prepare for the discussion of our model, we shall 
introduce in Chapter 2 a simplified model of human compre- 
hension. In Chapter 3 we will discuss the requirements 
for a quality or high quality machine translation system. 

In Chapter 4 we discuss the capabilities of current 
competence models from the point of view of applicability 
to machine translation. Chapter 5 gives an outline of 
our model, the Linguistics Research System. Chapter 6 
discusses primarily a development in computer storage 
whose impact on the scientific community, in particular 
on linguistics and linguistic studies, cannot be estimated
yet.




2. Comprehension and Translation 



In order to describe and clarify the extent to which
translation of a text is dependent on the comprehension of
that text, we shall construct a simplified, restricted model
of human comprehension and determine the components of this
model which will have to be part of a translation device.

To facilitate the description of such a model we shall
introduce the following terms by example: state of affairs,
state-of-affairs-description, the image of a state of affairs,
the image of a state-of-affairs-description.

Assume that a number of people observe an incident Q, a traffic
accident, involving two objects: a car and a pedestrian.

Two or three observers make the following statements about Q:

1) There was a car-pedestrian accident.

2) A car hit a man.

3) A Porsche hit a man.

We shall say that the statements 1 through 3 describe the same
state of affairs Q (SA Q), though with different information
content. We shall call each statement a description of SA Q,
or an SA-description of Q. Clearly, each of the statements
is not only an SA-description of Q but of several SA's




We shall further posit that every sentence, whether command, 
request, question or statement, is a description of some SA. 

An SA need not have any physical reality. This follows from 
the fact that an SA-description may be false. 

Let us now assume a device K - with several components - which
can process SA-descriptions, store them and reproduce them;
it can also assign to an SA-description p all the syntactic,
structural descriptions of p; it can further associate one or
more images with each SA-description. Thus, K associates the
different images a, b, and c with the SA-descriptions 1, 2,
and 3, respectively. However, K associates the same image d
with the SA-descriptions 4 and 5; it associates image e with
the SA-descriptions 6 and 7, image f with the SA-descriptions
8 and 9, and two different images g and h with the
SA-description 10.

4) A Porsche hit a man.                        (image d)

5) A man was hit by a Porsche.                 (image d)

6) The man scaled the fish.                    (image e)

7) The man desquamated the fish.               (image e)

8) A car, a Porsche, hit a man.                (image f)

9) A Porsche, which is a car, hit a man.       (image f)

10) George observed a man with a telescope.    (images g, h)



[...] the relations between a DSA-image and an
SA-description are similar to those that hold between an
SA-image and an SA [...]










more than one SA-description. An SA-description can be
associated with more than one DSA-image; whenever this is
the case, we call the SA-description ambiguous.)
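The definition of ambiguity can be sketched as a lookup from SA-descriptions to image labels. The table below is a hypothetical stand-in for the semantic component of device K; only the image labels d, g and h come from the examples above.

```python
# Hypothetical stand-in for K's semantic component: each SA-description
# is mapped to the labels of the images K associates with it.
IMAGES = {
    "A Porsche hit a man.": ["d"],
    "A man was hit by a Porsche.": ["d"],
    "George observed a man with a telescope.": ["g", "h"],
}


def is_ambiguous(description: str) -> bool:
    """An SA-description is ambiguous if it has more than one image."""
    return len(set(IMAGES.get(description, []))) > 1


def share_image(first: str, second: str) -> bool:
    """True if two SA-descriptions are associated with a common image."""
    return bool(set(IMAGES[first]) & set(IMAGES[second]))
```

In a real device the images would be computed from the sentence rather than looked up; the table only makes the one-to-many relation visible.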

Let us now clarify the term DSA-image by constructing the
DSA-image for the following sentence:

11) A woman sold a car to a man for some money. 

For the time being let us refer to the woman as A, to the car
as B, to the man as C, to the price as D. Let us now describe
what happens during a sale of some property. a) A, the owner
of B, gives B to C. Let us represent this by the following
graph (arrows represent relations and unary actions):




Figure 1 



where "1,i" represents "A gives to C at time i", "2,i" represents
"A gives object B at time i", and "3,i" represents "A owns B at
time i". Note that the ternary relation "A gives B to C"




the money D, gives this money to A as a compensation for
the acquisition of B. This results in Fig. 3, where k is




Figure 3 



later than j. The double arrow between D and B represents
a symmetrical relation; 4(B,D) stands for "B is a compensation
for D". d) Finally, A acquires property of D and C
loses property of D, resulting in the graph




Figure 4 











where the graph "•——>" represents a property of the node,
and a line perpendicular to a property (or relation) a
logical or; 5 represents the property human, and 6 the
property legal entity. The sold object B must finally be
an object or a right to some object; D can be an object, or
the right to some object, or money, which will be represented
in the graphs



Figure 6 (edge labels 7,i and 8,i)



where 7 represents the property "physical object", 8 the
relation "right to" and 9 the property "money".




Sentence 11': "A sells B to C for D" thus results in the
following DSA-image:



The following conventions have been used in this figure:
An expression of the form "number*, +*+letter" (e.g. 3 is to be 

read as "The property or relation represented by the number 
ends at the point of time represented by the letter". An expression 
of the form "number +, +letter++" is to be read "the property 
or relation represented by the number begins with the point of time 
represented by the letter". An expression of the form 
number +, +letter" is to be read as "the property or relation 
expressed by the number begins at and terminates with the point of 
time represented by the letter". An expression of the form 
"number" (with no letter) expresses that the property or 
relation has no time boundaries . We prefer the representation 
in Figure 7 to the equivalent representation in Figure 8. 



Figure 8 




of sentence 11, we still need to perform the predications
upon the objects referred to by the expressions "a woman",
"a man", "a car", and "money". These will be represented
in that order by the following graphs:





11 = adult

13 = car

Figure 9



Sentence 11 will result in the following DSA-image: 




Figure 10 



Note that in comparison with Figure 7 predications




We shall further assume that device K contains an additional
component in which SA-images, images of the original state
of affairs, are stored. Each SA-image is generated by means
of the information provided by a DSA-image by replacing the
object variables by constants. The SA-image constructed from
sentence 11 would be identical with the DSA-image in Figure 10
if A, B, C, and D were replaced by x1, x2, x3, x4, respectively.

Each SA-image t of SA Q is consequently a partial, i.e.
imperfect, representation of the original SA Q.
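The step from a DSA-image to an SA-image, replacing object variables by constants, can be sketched as a substitution over a set of predications. The flat set-of-tuples encoding and the predicate names are simplifying assumptions; the report's figures use labelled graphs with numbered relations.

```python
# Simplified DSA-image of "A woman sold a car to a man for some money":
# each entry is (predicate, argument tuple) over object variables A-D.
DSA_IMAGE_11 = frozenset({
    ("woman", ("A",)),
    ("car", ("B",)),
    ("man", ("C",)),
    ("money", ("D",)),
    ("sell", ("A", "B", "C", "D")),
})


def instantiate(dsa_image, binding):
    """Replace object variables by constants; unbound names are kept."""
    return frozenset(
        (predicate, tuple(binding.get(arg, arg) for arg in args))
        for predicate, args in dsa_image
    )


SA_IMAGE_11 = instantiate(
    DSA_IMAGE_11, {"A": "x1", "B": "x2", "C": "x3", "D": "x4"}
)
```

The structure of the image is untouched by the substitution, which matches the remark that the two images would be identical up to the replacement of A, B, C, D.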



A further component of K is able to superimpose two SA-images
p and r of an SA Q and thus derive an SA-image v of SA Q by
modifying - during the processing of a text - the current
SA-image p of SA Q by means of the new SA-image r of SA Q; the
result is a more precise representation of the original SA Q:
SA-image v. Let us call such superimposed SA-images
connected SA-images. This component also deletes all but the
SA-image t of an ambiguous SA-description, as well as their
DSA-images, if t was connected with some SA-image q. This
capability means that the device is able to connect SA-images
represented in different SA-descriptions, similar to the









(2) A car hit a man. (12) It was a Porsche.

respectively, then, when each has processed its first
sentence (4 and 2), the contents of the data storage of the
two devices will be different in at least three respects:
each will contain a different SA-description; each, a
different DSA-image; and each, a different SA-image. When,
however, the second device has processed sentence 12, both
devices will have an identical SA-image. That is, the sentence

(4) A Porsche hit a man.

and the sequence of sentences

(2) A car hit a man. (12) It was a Porsche.

result in the same SA-image.
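A minimal sketch of connecting SA-images follows, under two assumptions that go beyond the text at this point: the pronoun "It" has already been resolved to the constant x1, and the device applies the rule "if x is a Porsche, then x is a car" while building images. The names `close` and `connect` are illustrative.

```python
# Assumed rule used while building images: Porsche(x) implies car(x).
RULES = {"porsche": "car"}


def close(image):
    """Add every predication implied by RULES until nothing changes."""
    facts = set(image)
    changed = True
    while changed:
        changed = False
        for predicate, *args in list(facts):
            implied = RULES.get(predicate)
            if implied is not None and (implied, *args) not in facts:
                facts.add((implied, *args))
                changed = True
    return frozenset(facts)


def connect(current, new):
    """Superimpose a new SA-image on the current one."""
    return close(set(current) | set(new))


# Device 1 reads sentence 4; device 2 reads sentence 2, then sentence 12.
device_1 = close({("porsche", "x1"), ("man", "x2"), ("hit", "x1", "x2")})
device_2 = close({("car", "x1"), ("man", "x2"), ("hit", "x1", "x2")})
differ_after_first_sentence = device_1 != device_2
device_2 = connect(device_2, {("porsche", "x1")})
```

After the second device has processed sentence 12, both images contain the same four predications.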



When device K processes sentence 10: 

(10) George observed a man with a telescope. 

it will construct two DSA-images and two SA-images; this
expresses the ambiguity of this sentence. If K subsequently
processes sentence 13:





[...] image component from the set of configurations of the
[...] memory of K; the set of SA-image states [...] through T,
the [...] states immediately preceding the [...] of the [...]
the short-span memory of K. 16 [...] current state of [...]

Let us further assume that device K has a meaning rule component
with inference rules, statements of definitions, and equivalence
rules. Examples of inference rules are

a) For all x: if x is a Porsche, then x is a car, 

b) For all x: if x is a car, then x is a vehicle, 

c) For all x: if x is human, then x is animate. 
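The three inference rules can be read as a chain from more specific to more general predicates. The sketch below stores them in a dictionary and follows the chain; the single-consequent encoding and the name `implied_predicates` are assumptions made for brevity.

```python
# Rules a) to c): each predicate implies exactly one more general one.
INFERENCE_RULES = {
    "Porsche": "car",
    "car": "vehicle",
    "human": "animate",
}


def implied_predicates(predicate: str) -> list:
    """Return all predicates that follow from `predicate`, nearest first."""
    result = []
    seen = {predicate}
    current = predicate
    while current in INFERENCE_RULES:
        current = INFERENCE_RULES[current]
        if current in seen:  # guard against circular rule sets
            break
        seen.add(current)
        result.append(current)
    return result
```

Rules a) and b) together give "if x is a Porsche, then x is a vehicle" without that rule being stored separately.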

An example of a definition is

the SA-image in Figure 11 = Df the SA-image in Figure 7'

(in Figure 7' A, B, C, D of Figure 7 have been replaced by
x, y, z, v, respectively).




Examples of equivalence rules are given in Figure 12.

Figure 12 (relations shown: sell, buy)



These graphs represent the meaning rules:

sell(<x,1>, <y,2>, <z,3>, <v,4>) = Df buy(<x,3>, <y,2>, <z,1>, <v,4>)
= Df pay(<x,3>, <y,4>, <z,1>, <v,2>)
= Df pass(<x,3>, <y,1>, <z,4>, <v,2>).
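Read as data, each meaning rule assigns an argument position to the four participants x, y, z and v. The following sketch uses that reading to reorder the arguments of one verb into the order required by another; the table layout and the function name `rephrase` are assumptions, while the positions are those of the rules above.

```python
# Argument position (1-based) of each participant per relation.
MEANING_RULES = {
    "sell": {"x": 1, "y": 2, "z": 3, "v": 4},
    "buy":  {"x": 3, "y": 2, "z": 1, "v": 4},
    "pay":  {"x": 3, "y": 4, "z": 1, "v": 2},
    "pass": {"x": 3, "y": 1, "z": 4, "v": 2},
}


def rephrase(verb: str, args: tuple, target: str) -> tuple:
    """Reorder the arguments of `verb` into the order used by `target`."""
    source_positions = MEANING_RULES[verb]
    target_positions = MEANING_RULES[target]
    by_participant = {
        name: args[pos - 1] for name, pos in source_positions.items()
    }
    reordered = [None] * len(args)
    for name, pos in target_positions.items():
        reordered[pos - 1] = by_participant[name]
    return tuple(reordered)
```

For example, `rephrase("sell", ("woman", "car", "man", "money"), "buy")` yields the argument order of "A man bought a car from a woman for some money".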






The sentence "A woman (x) sold a car (y) to a man (z) for
some money (v)" can thus also be represented as "A man
bought a car from a woman for some money", "A man paid some
money to a woman for a car", "A car passed for some money
from a woman to a man". 17 Thus, device K can construct by
means of the rules of the meaning rule component, in
particular by means of the definitions, molecular SA-descriptions,
molecular SA-images, and connected molecular SA-images from
the SA-images, DSA-images, and connected SA-images, which




structure 



Thus, the graph in Figure 10, which represents
sentence 11: "A woman sold a car to a man for some money",
will result in the graph in Figure 13. (We represent
molecular images by two-dimensional figures: quaternary
relations by a diamond, properties of an object by a
rectangle; 18 objects are represented by a dot; the names
of relations and properties are represented by numbers in
the geometrical figures. The names of objects occur
beside the dots; the numbers on the lines between
relations and objects represent the order of the arguments.)




[...] corresponding to the graph [...] x3, x4, respectively. [...]

[...] permanently store only molecular DSA-images and
molecular SA-images, since it can



9 = money

29 = sell

30 = woman

31 = car

32 = man

Figure 13


















construct the corresponding atomic DSA-images and SA-images 
by means of its meaning rule component with its definitions 
and inference rules, when required. 

We suppose nobody will seriously doubt that, indeed,
connected SA-images, atomic and/or molecular, or simulations
of them are stored in comprehension devices, as e.g. in the
human brain, or that SA-images are necessary besides
DSA-images. Without this assumption, it would be fairly
difficult to explain the inconsistencies in a number of
SA-descriptions of some SA R when no two of them are
inconsistent. Let us demonstrate this by the following
three SA-descriptions of the same SA which may occur
distributed over some text.






13) The final conference on the "Theoretical Study
Effort of High Quality Translation" was held in
Austin, Texas, from January 11 through January 15,
1971.

14) When the final conference on the "Theoretical Study
Effort of High Quality Machine Translation" was
held, it rained every day in Austin.

15) No rainfall occurred in Austin, Texas, during the










As we can easily verify, each pair of the statements 13
through 15 is consistent. The three statements together,












statements 11 through 13 does not simply follow from the
connected SA-images representing the state of affairs
described by statements 11 through 13. For this we need
an additional component, a logical component.
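What such a logical component has to detect can be shown with a brute-force consistency check. The propositional encoding below is an assumption: the three conference statements are reduced to two atoms, `held` (the conference took place in Austin on the stated days) and `rain` (it rained in Austin on those days), and the third statement, which is incomplete in the text, is assumed to cover the conference days.

```python
from itertools import combinations, product

# Assumed propositional reading of the three conference statements.
STATEMENTS = {
    "conference held": lambda held, rain: held,
    "rain whenever held": lambda held, rain: (not held) or rain,
    "no rainfall": lambda held, rain: not rain,
}


def consistent(names) -> bool:
    """True if some truth assignment satisfies all named statements."""
    return any(
        all(STATEMENTS[name](held, rain) for name in names)
        for held, rain in product([False, True], repeat=2)
    )


PAIRWISE = {
    pair: consistent(pair) for pair in combinations(STATEMENTS, 2)
}
ALL_THREE = consistent(list(STATEMENTS))
```

Every pair is satisfiable, but the three statements together are not, which is the situation the paragraph above describes.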

That a process corresponding to the connection of SA-images 
actually occurs in the human brain is most obvious whenever 
a hearer encounters a sentence which - in isolation - is 
semantically anomalous or possibly even contradictory. Thus, 
sentences 16 and 17: 



16) Haensel broke off a part of the roof and ate it. 

17) This boy is a girl. 



which are not semantically well-formed, i.e. whose DSA-images
are not "well-formed", make sense in their proper context.
Sentence 16 occurs in Grimm's fairytale Haensel und Gretel,
sentence 17 in numerous stories in which a girl, in order
to be near her lover, a soldier, disguises herself and
joins the army. Her true identity is eventually discovered.
In the case of sentence 16, the system has stored the fact




that the house and its parts are edible. Thus
sentence 16 is compatible with the state of
the current connected SA-images, though the [...]
sentence 16 violates at least one of the rules of the
meaning rule component. In the case of sentence 17, [...]
contradictory and thus logically false [...]



that one of the predications a and b with the argument x1
(the disguised girl) in the SA-image of sentence 17

a) x1 is a boy and b) x1 is a girl

is not consistent with the current SA-image pertaining to
x1. The system, depending on outside information, either
rejects predication (a) as false, or predication (b), or
both.



We shall now introduce the last necessary component of device
K. So far, we have tacitly assumed that an SA-description
describes an SA that occurs or exists outside of K. An
SA-description may, of course, also describe SA's inside
of K, as, for example, components, meaning rules, states,
SA-descriptions, DSA-images, and SA-images. We shall
classify two devices J and K, with the same properties
mentioned so far and identical internal configurations,
according to the way they process or react to the following
statements:

17) Did Mary sell a car? 

18) What did Mary sell? 

19) Mary sold a car. 

20) "Mary sold a car" is a sentence. 

21) "Mary sold a car" is not a sentence. 

Device J processes the sentences 17 through 21, storing for
each sentence the SA-description, and the associated
DSA-images and SA-images. Its only in-built reaction is that
either sentence 20 or sentence 21 or both be deleted from
the memory, since they are inconsistent. K reacts in the
following way (we shall use "SA:x" for "the SA described
by the SA-description of SA x"): When K has established the
SA-image of SA:17, it searches through its memory. If an
SA-image identical (except for the representation of negation)
to the SA-image the device J would produce when processing
sentence 19 has been stored or can be deduced from existing
SA-images by means of meaning rules and logical rules, K
prints out "no", if at least one negation occurs in the SA;
"yes", if no negation occurs. If no such SA-image is found,
K prints out the stereotype answer: "The question cannot be
answered, insufficient information." For sentence 18:
Again, recognizing that an answer is expected, searching
through its memory and finding a representation of "Mary sold
her house on the 20th of July, 1969. She got $25,000 from
Henry for it.", K prints out: "Mary sold a house to Henry
for $25,000 on July 20, 1969." K then continues processing
statement 19 in the way J processes it. We shall call
device J a somewhat sophisticated language data processor;
we shall call K a model of comprehension or a device with
rudimentary artificial intelligence.
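The yes/no behaviour of K can be sketched as a memory lookup. This is a simplified illustration: the memory holds (fact, negated) pairs, deduction by meaning rules and logical rules is left out, and the memory is assumed to be consistent. The names `answer_yes_no` and `MEMORY` are hypothetical.

```python
NO_ANSWER = "The question cannot be answered, insufficient information."


def answer_yes_no(memory, query) -> str:
    """Answer a yes/no question from stored SA-images.

    `memory` is a set of (fact, negated) pairs. Only direct lookup is
    performed; deduction from other SA-images is not modelled here.
    """
    if (query, False) in memory:
        return "yes"
    if (query, True) in memory:
        return "no"
    return NO_ANSWER


# Hypothetical memory contents: one positive and one negated SA-image.
MEMORY = {
    (("sell", "Mary", "car"), False),
    (("sell", "John", "boat"), True),
}
```

Device J, by contrast, would only store the question "Did Mary sell a car?" without producing any of these three reactions.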

A slightly more intelligent version of K, having generated
the DSA-images of SA 20 (or SA 21), will analyze the DSA-images
x -[15] (and its negated counterpart) by means of an operation
rule ([15] represents the predicator "sentence"). This
operation rule, a subroutine called by [15] (or its negation),
establishes that the SA-description is true (false) if x is
generatable by the syntactic component; if x is not generatable,
that the SA-description is false (true). The corresponding
SA-images and DSA-images will be deleted.



This "awareness" component of K, if modified slightly in the
way indicated below, would also make device K a restricted
speech production device. The modifications necessary would
be:

a) K may print out a sequence of SA-descriptions t_1, t_2,
..., t_n;

b) each t_i (1 ≤ i ≤ n) is a partial, incomplete representation of
the underlying SA-image;

c) for each t_i, t_i+1 (1 ≤ i < n): the SA-image of t_i is connected
with the SA-image of t_i+1;

d) the conjunction of all SA-descriptions t_i (1 ≤ i ≤ n) is an
exhaustive description of the underlying SA-image. 19



By means of the semantic component and the definitions in
the meaning rule component given in the following figure,
K can produce the sequence of sentences below.

where 29 represents "sell"; 53, "give"; 4, "is a compensation
for"; the caret stands for logical and.












22) A woman sold a house. A man gave her money for it. 

23) A house was sold. The owner, a woman, got some 
money for it. The present owner is a man. 

24) A woman sold something. It was a house. Somebody, 
a man, gave her some money. The money is the 
compensation for the house. etc. 

In addition to the necessary components already mentioned, 
the device may contain several others, as e.g. a component 
which associates a stylistic interpretation with an SA- 
description t, or a component which corrects printing errors.

Let us recapitulate the major properties of the comprehension
device. It is able to store and reproduce SA-descriptions.
By means of a syntactic component, it can associate with each
processed SA-description t all and only the syntactic
descriptions of t. By means of a semantic component, it can
associate with an SA-description t all and only the
DSA-images of t. It can further associate all and only the
SA-images of t with SA-description t by means of a discourse
structure component. The association component of K
performs the connection of SA-images pertaining to the same
SA.

In addition, the device contains a meaning rule component,
a logical component, and an "awareness" component. A more
elaborate description of such a model of comprehension for
purposes of Information Retrieval can be found in our report
"Normalization of Natural Language for Information Retrieval".




Let us now represent the terms introduced above by their
linguistic equivalences. An SA-description is a sentence
in natural language. The syntactic description of an
SA-description is the description of the surface structure of a
sentence in natural language. An atomic DSA-image
represents the meaning of a sentence in isolation. An atomic
SA-image represents the meaning of a sentence in context.
Molecular images may correspond to "semantic readings".

We are not aware of an established linguistic term which
corresponds to the set of connected SA-descriptions in the
current state T of the device; it represents the current
knowledge of facts of the device. The term "state of affairs"
finally corresponds to the terms "referent", "significatum",
"denotatum". 22

We shall call a sentence t synonymous with a sentence u if
t and u have the same SA-image or meaning. 23 In particular
we shall call sentence t a paraphrase of sentence u if t is
synonymous with u, and t and u are sentences of the same
language. We shall call sentence t a translation of sentence
u, if t and u are synonymous, and t and u do not belong to
the same language.
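These three definitions amount to a small decision procedure. In the sketch below the SA-image is reduced to an opaque label, and the `Sentence` type, the `relation` function and the labels themselves are assumptions for illustration; the example sentences are taken from this report.

```python
from typing import NamedTuple


class Sentence(NamedTuple):
    text: str
    language: str
    sa_image: str  # opaque label standing in for the full SA-image


def relation(t: Sentence, u: Sentence) -> str:
    """Classify t relative to u according to the definitions above."""
    if t.sa_image != u.sa_image:
        return "not synonymous"
    if t.language == u.language:
        return "paraphrase"
    return "translation"


ACTIVE = Sentence("A Porsche hit a man.", "English", "d")
PASSIVE = Sentence("A man was hit by a Porsche.", "English", "d")
GERMAN = Sentence("Er kam ihr zu Hilfe.", "German", "aid")
ENGLISH = Sentence("He came to her aid.", "English", "aid")
```

Paraphrase and translation thus differ only in the language test; both rest on identity of SA-images.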

The purpose of these explanations was to provide the basis 
for a discussion of the components of a translation device 
and, in particular, of the question which of the components 
of a comprehension device should be part of such a
translation device.








3. Desirable Properties of a Translation Device 



It is sometimes argued that in translation, at least in MT,
it is not necessary to understand the meaning of a text as
long as the target language equivalents for the words and
syntactic structures of the source language can be correctly
established or - in our formulation - as long as molecular or
atomic expressions and syntactic structures of the source
language can be mapped into the corresponding equimolecular or
atomic expressions and structures of the target language.

We shall investigate, by means of the following German examples
and their English translations, the extent to which this claim
is justifiable by showing some of the problems that a mechanical
translation device T will encounter and will have to solve.

We shall try to indicate which of the components of device T
will be involved in handling a particular problem, and,
specifically, which components of device K must be part of T.

(We do not restrict our attention to the translation of
scientific texts. Statements on the greater ease with which
such material may be mechanically translated seem to express
to a greater extent opinions rather than careful
investigations; 24 we also assume that MT device T will be able to
translate scientific texts if it can translate "normal text",
provided that the necessary vocabulary and their equivalences
have been incorporated into T.)






The first requirement that an MT device should meet is to 
be able to derive the semantic reading R of t from a surface 
sentence t. In particular, an MT device should be able to 
handle syntactic problems represented by the following German 
examples. (In each of these examples, the correct English 
translation will be preceded by a literal translation.) 

1. Die Geschichte faengt mit einer Explosion an.
   The history catches with an explosion at.
   History begins with an explosion.

2. Er liess ihr Bescheid sagen, dass ...
   He let her notice say that ...
   He sent word to her that ...

3. Ich habe ihm aber Bescheid gesagt.
   I have him but notice said.
   I gave him a piece of my mind.

4. Die Sonne geht im Osten auf und im Westen unter.
   The sun goes in the east up and in the west down.
   The sun rises in the east and sets in the west.

5. Fritz ist nach Spanien, seine Frau nach Italien
   und ihre Tochter nach Griechenland gereist.
   Fritz is to Spain, his wife to Italy, and their
   daughter to Greece traveled.
   Fritz traveled to Spain, his wife to Italy, and
   their daughter to Greece.

It may be obvious from these examples that the system will
need the capability to deal with discontinuous elements as
in sentence 1; it will have to be able to assign a syntactic
description and semantic interpretation to such combinations
of lexical items within a particular sentence, independent
of the syntactic description and semantic interpretation of
the individual items in the dictionary. The same capabilities
are required for examples 2 and 3, which represent phrasal and
idiomatic expressions. In particular, the system will need the
capability of dealing with combinations of lexical items with
internal variable slots. The items filling such slots may
either not be translated at all, as in examples 6 and 7; or
be translated, as in the idioms in examples 9 and 11.

(Such items are underlined in the following examples.)

6. Die Entwicklung nahm ihren Anfang mit ...
   The development began with ...

7. Der Aufstand nahm seinen Anfang mit ...
   The revolution began with ...

8. Er schoss einen Bock.
   He shot a buck.
   He made a mistake. 25

9. Er schoss einen gewaltigen Bock.
   He shot a tremendous buck.
   He made a tremendous mistake.

10. Den Entschluss fassen, etwas zu tun.
    To seize the decision to do something.
    To decide to do something.

11. Den festen Entschluss fassen, etwas zu tun.
    To seize the firm decision to do something.
    To decide definitely to do something.

(We observe in sentences 6 and 7 that the gender of the
German possessive pronoun, which has no equivalent in the
English translation, is dependent on the gender of the subject.)
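An idiom with an internal variable slot, as in the "Bock schiessen" examples, could be handled roughly as follows. This is only a sketch under strong assumptions: the sentence frame is fixed, the slot holds at most one adjective, and `ADJECTIVES` is a hypothetical one-entry lexicon.

```python
import re

# Hypothetical mini-lexicon for slot fillers.
ADJECTIVES = {"gewaltigen": "tremendous"}

# "Er schoss einen [ADJECTIVE] Bock." with an optional adjective slot.
IDIOM = re.compile(r"^Er schoss einen (?:(\w+) )?Bock\.$")


def translate_idiom(sentence: str):
    """Translate the idiom as a unit; return None if it does not apply."""
    match = IDIOM.match(sentence)
    if match is None:
        return None
    filler = match.group(1)
    if filler is None:
        return "He made a mistake."
    adjective = ADJECTIVES.get(filler)
    if adjective is None:
        return None  # slot filler unknown to the lexicon
    return f"He made a {adjective} mistake."
```

The idiom is matched as a whole and only the slot filler is translated on its own, which is the behaviour required for examples 9 and 11; a slot whose filler is dropped, as in examples 6 and 7, would simply be ignored in the output.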




The system must also be able to assign a semantic function
to the constituents of sentences dependent on the meaning
of those constituents and not necessarily on their syntactic
function (cf. examples 12 through 16). Thus, the adverbs
underlined in the German examples 12 through 14 have to be
interpreted as semantic predicates or at least have to be
mappable into predicates, given in broken underlines, of the
output language; the German dative objects in sentences 15
and 16 appear as English possessives:

12. Er studiert gern Physik.
    He likes to study physics.

13. Er studierte lieber Physik.
    He preferred to study physics.

14. Er sprach weiter.
    He continued to talk.

15. Er kam ihr zu Hilfe.
    He came to her aid.

16. Sie brachte es ihm zur Kenntnis.
    She called it to his attention.

(We may note in examples 12 through 14 that the tense of
the original German predicate is associated with the English
predicate which itself is a translation of the German adverb.)

With respect to the languages German and English, the system
should also be able to translate the German article in cases
of inalienable property as the English possessive:

17. Er kreuzte die Arme.
    He crossed his arms.

18. Er legte ihr die Hand auf die Schulter.
    He put his hand on her shoulder.

We further expect from a translation device that it not only 
associate a correct semantic reading with a sentence but 
rather that it provide the correct semantic reading. That is, it 
should be able to assign to a sentence t all its semantic 
readings in the case that t is ambiguous and should further 
be able to select from those readings the one which is correct 
in the textual environment. 



19. Die Maenner hatten die Frauen ermordet. Wir nahmen
sie drei Tage spaeter gefangen.

The men had murdered the women. We caught them three
days later.

20. Die Frauen waren von den Maennern ermordet worden.
Wir nahmen sie drei Tage spaeter gefangen.

The women had been murdered by the men. We caught
them three days later.

21. Die Maenner hatten die Frauen ermordet. Wir
beerdigten sie drei Tage spaeter.

The men had murdered the women. We buried them
three days later.

22. Die Frauen waren von den Maennern ermordet worden.
Wir beerdigten sie drei Tage spaeter.

The women had been murdered by the men. We buried
them three days later.

The problem in examples 19 through 22 is the recognition of
the proper referent of the pronoun "sie" in the second
sentence of each example. We maintain that none of the
four two-sentence combinations is ambiguous: "sie" in
examples 19 and 20 uniquely refers to the men; in examples
21 and 22, it uniquely refers to the women. Since both men
and women can be captured as well as buried, there is no clue
in the semantic reading of the words "men" and "women" which
permits the correct association of the proper referent for
the subsequent pronoun. Thus, "wir nahmen sie drei Tage
spaeter gefangen" in examples 19 and 20, and "wir beerdigten
sie drei Tage spaeter" in examples 21 and 22 should be either
ambiguous or vague. We can explain the non-ambiguity and
non-vagueness of the sentences by the fact that a meaning
rule "for all Y: if X kills Y, then Y is dead" is used
when the SA-image of the first sentence of each sentence pair
is constructed; i.e. that an SA-image is generated in which
the argument "women" receives the predication "dead".
Assuming that the verb "gefangen nehmen" requires for
semantic wellformedness a human object that is alive and
"beerdigen" an animate object which is not alive, we can
easily explain the establishment of the proper referent.
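
The referent selection just described can be illustrated by a small sketch. It is not the report's implementation: the `Entity` class, the feature names, and the two tables are assumptions made only for illustration. The sketch applies the meaning rule "if X kills Y, then Y is dead" while the image of the first sentence is built, and then filters the candidate referents of "sie" by the selection restriction of the second verb.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Entity:
    """A toy stand-in for an argument in the SA-image (hypothetical)."""
    name: str
    features: Set[str] = field(default_factory=set)


def apply_meaning_rules(verb: str, obj: Entity) -> None:
    # Meaning rule from the text: for all Y, if X kills Y, then Y is dead.
    if verb == "ermorden":
        obj.features.discard("alive")
        obj.features.add("dead")


# Assumed selection restrictions on the object of the second verb.
OBJECT_RESTRICTIONS = {
    "gefangen nehmen": {"human", "alive"},
    "beerdigen": {"animate", "dead"},
}


def build_image() -> List[Entity]:
    """Image of 'Die Maenner hatten die Frauen ermordet.'"""
    men = Entity("men", {"human", "animate", "alive"})
    women = Entity("women", {"human", "animate", "alive"})
    apply_meaning_rules("ermorden", women)
    return [men, women]


def resolve_pronoun(verb: str, candidates: List[Entity]) -> List[str]:
    """Return the candidates whose features satisfy the verb's restriction."""
    required = OBJECT_RESTRICTIONS[verb]
    return [e.name for e in candidates if required <= e.features]
```

With these toy tables, `resolve_pronoun("gefangen nehmen", build_image())` keeps only the men and `resolve_pronoun("beerdigen", build_image())` keeps only the women, which mirrors the readings claimed for examples 19 through 22.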

The reader should not be misled by the fact that the
English translations of the problematic German sentences
display the same ambiguity in isolation. That access to the
established SA-image is necessary will be obvious when we
translate the sentences into Italian, where the selection of
the pronoun le or li referring to the women and the



The problems that have to be dealt with in examples 19
through 22 are, however, not restricted to such apparently
constructed examples, which are possibly rare in actual
texts, in particular in scientific texts. It is necessary
to point out that this problem, in a different appearance,
comes up fairly frequently in possibly every text. In
the sentences 23 and 24 the predicate liess ... frei is
translated correctly as set ... free in the environment
animate (physical) object, and as left ... blank in the
environment inanimate object, respectively.

23. Er liess Sylvia schliesslich frei.

He finally set Sylvia free.

24. Er liess schliesslich die Zeile frei.

He finally left the line blank.

However, in German and many other languages semantic features 
of nouns are neutralized when the nouns are pronominalized. 

Thus, the German sentences 23 and 24 both become sentence 25 
under object pronominalization, which, consequently, is 
ambiguous in isolation. 

25. Er liess sie schliesslich frei.

The sequences 26 and 27, each of which contains sentence 25,
correctly show different translations for 25:

2 6 , M fe o a S (/£ ui w Zazngzh. zh.£h.agzn'. 

E/l I^zaaI: a\L<l a d.hZJ,zAAJLic.h iKzJi. ' ’• , V f/’ • .* ’/ . . 

McLA.fi c.0 u.£d.vi 1 £ bzcch,, Sy&\jj.cL 1 a oh.dzc.JL c\yiy JLongzh., y,.. 




27. Mark wusste nicht, wie er die letzte Zeile
ausfuellen sollte. Er liess sie schliesslich frei.

Mark didn't know how to fill in the last line.
He finally left it blank.

It follows that for the proper translation of such German
sentences, we need to be able to recover the disambiguating
semantic features from the contextual information which has
been lost due to the pronominalization of the disambiguating
German nouns.
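
As a rough illustration of this recovery step, the following sketch looks up the semantic feature of the antecedent noun and uses it to choose between the two translations of liess ... frei. The lexicon, the feature names, and the pronoun table are hypothetical and cover only the nouns of examples 23 through 27.

```python
# Hypothetical feature lexicon for the antecedent nouns in the examples.
NOUN_FEATURES = {"Sylvia": "animate", "Zeile": "inanimate"}

# Translation frames of "liess ... frei", keyed by the object's feature.
LIESS_FREI = {
    "animate": "set {obj} free",
    "inanimate": "left {obj} blank",
}

# English object pronouns assumed for the two antecedents above.
OBJECT_PRONOUN = {"animate": "her", "inanimate": "it"}


def translate_sentence_25(antecedent: str) -> str:
    """Translate 'Er liess sie schliesslich frei.' given the antecedent
    of 'sie' recovered from the preceding context."""
    feature = NOUN_FEATURES[antecedent]
    frame = LIESS_FREI[feature].format(obj=OBJECT_PRONOUN[feature])
    return f"He finally {frame}."
```

How the antecedent is found is left open here; the sketch only shows that once it is known, the neutralized feature can be read off the noun rather than the pronoun.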



It may be interesting to point out that of the 36 selection
restrictions associated with the eight verbs in the appendix
of my paper "Lexical Features in Translation and Paraphrasing:
An Experiment", 13 entries cannot be translated properly if
the stated semantic feature for subject or object is
neutralized due to pronominalization. This surprisingly high
percentage might become even larger if we take into account
that the semantic features listed in that paper sometimes
are not sufficient for correct interpretation or translation,
and additional, more refined semantic features might be
required. (Cf., for example, the entry erhalten.)

Attempts to solve such problems by assigning to the various
translation equivalents a probability, possibly based on
criteria of frequency of occurrence, we regard as being
unsatisfactory. Assume that an item with two different
translations is translated as X in 60% of all the cases and
as Y in 40% of the cases. To base translation on their
assigned probability will mean that on an average in 100
occurrences of the item we will obtain 40 wrong
interpretations and translations. This, moreover, is independent
of whether we use the translation X and Y or the
translation X alone. In the case that some MT system needs
to select translations on considerations of probability, we
would regard the restriction of the translation to just the
item X as more practical since the user could be warned that
X contains a certain margin of error: namely, that it may
mean Y in 40% of the cases, whereas, if translations X and Y
were used, the user would have to learn that X may mean Y in
40% of the cases and Y may mean X in 60% of the cases.
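
The arithmetic for the case in which the system always prints one fixed translation can be checked with a short sketch. The function name and the dictionary layout are assumptions; the figures are the 60/40 split assumed in the text.

```python
from typing import Dict


def wrong_per_hundred(chosen: str, correct_per_hundred: Dict[str, int]) -> int:
    """Number of wrong translations in 100 occurrences when the system
    always prints `chosen`, given how often each translation is correct."""
    return sum(n for item, n in correct_per_hundred.items() if item != chosen)


DISTRIBUTION = {"X": 60, "Y": 40}  # the split assumed in the text
```

Always printing X is wrong in the 40 cases where Y was required; always printing Y would be wrong in 60 cases.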

28. WIE GEHT ES IHNEN? Mir geht es gut.

How are you_sg? I am fine.

29. WIE GEHT ES IHNEN? Uns geht es gut.

How are you_pl? We are fine.

30. WIE GEHT ES IHNEN? Ihnen geht es gut.

How are they? They are fine.

Examples 28 through 30, moreover, show that translation of
individual sentences based on the information contained in
the immediately preceding context is not always possible.
The disambiguating information may be provided in sentences
which follow the ambiguous sentence. The argument that these
examples could be translated correctly if they were not given
in the frequent key punch representation which loses the
distinction between majuscule and minuscule holds only for
English.







28., 29.  Wie geht es Ihnen?     How are you?

30.       Wie geht es ihnen?     How are they?



For translation into other languages, as for example 
Spanish, we still need to be able to access the responses. 



(28.) Wie geht es Ihnen?     Mir geht es gut.
      Como esta Ud?          Estoy bien.

(29.) Wie geht es Ihnen?     Uns geht es gut.
      Como estan Uds?        Estamos bien.
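
A minimal sketch of such look-ahead, assuming the key punch (all capitals) representation of examples 28 through 30: the reading of IHNEN in the question is chosen from the first word of the response that follows it. The cue table and the reading labels are hypothetical; the Spanish table contains only the two renderings given above.

```python
from typing import Dict, Optional

# Hypothetical cues: first word of the reply -> reading of IHNEN.
REPLY_CUES = {"MIR": "you_sg", "UNS": "you_pl", "IHNEN": "they"}

ENGLISH = {"you_sg": "How are you?", "you_pl": "How are you?",
           "they": "How are they?"}
SPANISH = {"you_sg": "Como esta Ud?", "you_pl": "Como estan Uds?"}


def reading_of_question(reply: str) -> Optional[str]:
    """Pick the reading of 'WIE GEHT ES IHNEN?' from the following reply."""
    words = reply.split()
    return REPLY_CUES.get(words[0].upper()) if words else None


def translate_question(reply: str, table: Dict[str, str]) -> Optional[str]:
    """Translate the question, or return None if the reply gives no cue
    or the table has no rendering for the reading."""
    reading = reading_of_question(reply)
    return table.get(reading) if reading is not None else None
```

The English table shows why the question can be carried over into English without the response (readings 28 and 29 coincide), whereas the Spanish table needs the response in both cases.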



It may sometimes not be necessary for device T to have 
access to the environment in the cases where the ambiguities 
of the input sentences can be mapped into a corresponding 
output ambiguity, as examples 19 through 22 , 28, and 29, 
or sentence 31 show: 



31. Johann beobachtete den Mann mit dem Teleskop.
John watched the man with the telescope.



The capabilities of translation device T would certainly
increase if it contained a component which mapped input
ambiguity into corresponding output ambiguity, if
possible.

Whereas this capability may only be desirable, the
corresponding capability to carry over input uniqueness
into corresponding output uniqueness is certainly necessary.
That output non-ambiguity does not simply follow from input
non-ambiguity may be shown by means of sequence 32, where
brackets indicate that any, but only one, of the pronouns
in the brackets may be used; the subscript of a pronoun
indicates that it refers to the word with the same subscript
occurring in the preceding text.



32. Diese Maschine_1 hat einen Atommotor_2. Gestern ist
eines {ihrer_1 / seiner_2} Raeder_3 zerbrochen. Wir werden
{sie_1 / ihn_2 / es_3} zurueckschicken und Ersatz verlangen.



A translation which preserves the pronominalization would
result in the following sequence:

32a. This machine_1 has a nuclear engine_2. Yesterday
one of its_{1,2} wheels_3 broke. We will send it_{1,2,3}
back and demand a replacement.

As we can see, this translation introduces ambiguities which
do not occur in the German counterpart. The correct
translation should be:

32b. This machine_1 has a nuclear engine_2. Yesterday
one of the {machine's_1 / engine's_2} wheels_3 broke.
We will send the {machine_1 / engine_2 / wheel_3}
back and demand a replacement.
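
The step from 32a to 32b can be sketched as follows: a pronoun is kept only if no other candidate referent would receive the same pronoun in the output language; otherwise the noun is repeated. The two pronoun tables are assumptions covering only the three nouns of example 32, and the function is an illustration, not the procedure of any existing system.

```python
from typing import Dict, List

# Pronouns assumed for the three nouns of example 32 (Maschine, Atommotor, Rad).
GERMAN_PRONOUN = {"machine": "sie", "engine": "ihn", "wheel": "es"}
ENGLISH_PRONOUN = {"machine": "it", "engine": "it", "wheel": "it"}


def render_reference(referent: str, candidates: List[str],
                     pronouns: Dict[str, str], article: str = "the") -> str:
    """Use a pronoun only if it identifies the referent uniquely among
    the candidates; otherwise fall back to article plus noun."""
    pronoun = pronouns[referent]
    rivals = [c for c in candidates
              if c != referent and pronouns[c] == pronoun]
    return f"{article} {referent}" if rivals else pronoun
```

With all three nouns as candidates, the German table yields a distinct pronoun for each referent, while the English table forces the noun to be repeated, as in 32b.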



We finally expect from a good translation device that the
syntactic structure of translation u of some input sentence t
be isomorphic with or similar to the syntactic structure of t;
we also expect that the stylistic evaluation of subgraphs of
the structure of t be identical with the stylistic evaluation
of the corresponding graphs of the translation u of t. Both
statements, of course, are to be understood with the proviso
that such corresponding, similar structures or stylistic
evaluations occur in both languages.



So far, none of the examples mentioned has provided us with
counterevidence to the claim that translation is possible by
mapping molecular lexical items into equivalent molecular
items. How shall translation device T react if it meets a
molecular expression in one language which has no corresponding
equivalent equimolecular expression in the target language,
as predicted by adherents of the Humboldt-Cassirer
hypothesis, also called the Sapir-Whorf hypothesis?



Two solutions are possible: T may contain a dictionary in
which two or more molecular expressions of the target language
are given as the equivalent of the molecular expression in
the source language, or - to quote Professor Bar-Hillel - the
system may be permitted to "tell a story". The first way is
normally selected in dictionary entries, though very often
not very successfully, as translations like that of the German
entry "jemandem absehen" illustrate. Wildhagen gives the
translation equivalent "learn something by looking at a person",
Langenscheidt, "learn something from a person". According to
these translations, the German sentence "Er hat seiner Mutter
das Schoenschreiben abgesehen" would be translated as "He
learned calligraphy by looking at his mother" (Wildhagen) or
"He learned calligraphy from his mother" (Langenscheidt),
whereas the exact translation should be "He learned calligraphy
by watching his mother do it". The first dictionary translation
does not express the fact that there is a causal relation
between someone's learning some action or behavior and his
watching someone do it. The second translation does not
indicate the fact that this someone is performing the action
or displaying the behavior. A better translation would
consequently have been: "to learn doing x by watching someone
do x", and/or: "to learn to be x by watching somebody be x".
Assume now that a translation t' of term q cannot be provided
because the dictionary - due to lack of any translation
equivalent - does not contain a translation for q. (We do not
know of such occurrences.) In this case, System T
needs to be equipped with the capability for describing the
SA-image representing term q. This, however, can be simulated
by permitting System T to have access to its meaning rule
component, where it can read off the definition for the term
in question. This, again, means that the user of the MT
system can update the bilingual dictionary by providing as a
translation the equivalents of the terms used in the definiens
of the definition of q.
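
The fallback described here can be sketched in a few lines. Both tables are hypothetical: `BILINGUAL` stands for the bilingual dictionary and `MEANING_RULES` for the definitions (definientia) of the meaning rule component; the entries are invented for illustration and do not reproduce any dictionary cited above.

```python
from typing import Dict, List, Optional, Set

# Hypothetical bilingual dictionary and meaning-rule definitions.
BILINGUAL: Dict[str, str] = {"lernen": "learn", "durch": "by",
                             "zusehen": "watching"}
MEANING_RULES: Dict[str, List[str]] = {
    "absehen": ["lernen", "durch", "zusehen"],
}


def translate_term(term: str, _seen: Optional[Set[str]] = None) -> Optional[str]:
    """Translate a term directly, or else via the terms of its definiens."""
    if term in BILINGUAL:
        return BILINGUAL[term]
    seen = set() if _seen is None else _seen
    if term in seen or term not in MEANING_RULES:
        return None  # circular definition, or no definition available
    seen.add(term)
    parts = [translate_term(t, seen) for t in MEANING_RULES[term]]
    if any(p is None for p in parts):
        return None
    return " ".join(parts)


def update_dictionary(term: str) -> Optional[str]:
    """Store the combined translation as one new dictionary entry."""
    translation = translate_term(term)
    if translation is not None:
        BILINGUAL[term] = translation
    return translation
```

The combined translation is stored as a single entry, so later lookups of the same term no longer need the meaning rule component.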



Real problems will arise only if a state of affairs is
described in the source language which simply cannot be
described by any language-means in the target language. In
this case, both human and mechanical translation would be
impossible. We doubt that this will happen, in particular,
in scientific texts.



We finally investigate whether "self-awareness" is required
for translation device T. This may be discussed by means of
an example which was given by Roman Jakobson during one of
the conferences pertaining to the Study. In Polish as in
other Slavic languages the equivalent of "I" is normally
omitted, but stated in emphasis. In one of them (Czech, if
I recall properly) the opposite is the case. The translation
of a Polish text: Whenever he spoke of himself, he used the
word 'I'. into Czech should read: Whenever he spoke of himself,
he omitted the word 'I'. (Note that the translation of Polish
_I_ am speaking into Czech (I) am speaking (where underlining
indicates occurrence of the pronoun in the surface; enclosure
in parentheses, absence in the surface) is not beyond the
capabilities of the device; this could be handled by the
semantic or, possibly, the stylistic component.) Clearly,
the correct translation of such examples requires that the
system contain the ability to interpret statements about
itself or part of itself and associate those statements with the
corresponding parts of that system. The system would thus have to be able
to 'think' about itself or some of its parts. This capability,
artificial intelligence, we do not regard as necessary for
an MT system for some time to come.



The gravest argument against the possibility of mechanical
translation has been the claim that knowledge of the world
and even knowledge of the subject matter is required for the
translation of a text. This argument, reformulated for our
device T, reads: There are sentences whose ambiguity cannot
be resolved by access to the immediate preceding or following
textual environment. Sentences 33 and 34 may represent such
sentences:

33. Fred and John had beaten Mary and Jane so
brutally that we had to take them to a penal camp.

34. Fred and John had beaten Mary and Jane so
brutally that we had to take them to a hospital.

It seems obvious that we understand these sentences correctly,
i.e. that we can determine the proper referent for them
(necessary, e.g. for their proper translation into Spanish
los and las) because we have stored knowledge about certain
typical "sequences of states of affairs". The fact that we
understand these sentences in isolation does not mean, however,
that MT device T must have the same capability. Very often
the preceding and/or following context may contain - for us
redundantly - information which permits the disambiguation
of such sentences. Consider for example as a continuation
of sentences 33 and 34, respectively:



33a. After three weeks Fred and John were released from
the camp.

34a. After three weeks Mary and Jane were released
from the hospital.

Consider even the "counterevidence" given in the following
sentences 33b and 34b:

33b. There they were safe from Fred and John.

34b. There they posed no more danger to Mary and Jane.

As we can see, our knowledge about typical sequences of states
of affairs permits us to draw conclusions with some, normally
high, probability but not with absolute certainty. (This
probability may be 100% when the relation between states of
affairs is a cause-effect relationship.)

A difficulty of a different nature is represented by the fact
that certain terms have a different translation dependent on
the particular subject area they pertain to.



But again, we might expect continuations like:

35a. He attended every performance of the local
orchestra and watched the conductor with
admiration.

35b. As often as he could, he rode in a bus and
watched the conductor with admiration.



We do not intend to belittle these difficulties confronting
successful mechanical translation. On the other hand, we
believe it is fair to point out that no research has been
performed to find out the extent to which the preceding
and/or following context provides the information necessary for
the proper disambiguation of such sentences. We do, however,
believe that sentences do not occur in isolation, at least not
in material presented for translating, and that the required factual
knowledge may be replaceable by access to the information
contained in the contextual environment. If difficulties
should arise because the device, instead of printing out all
readings in such cases, prints out just one with a warning signal,
we may still rely on the powers of the reader to interpret
statements pertaining to a subject area he is well acquainted with.



Let us now recapitulate the properties that we expect MT device
T to have.









their location on the numbered lines in representations as in
Figure 12, page 21, can be computed from the syntactic structure
and the information associated with the lexical items occurring
in that structure; this, we are inclined to believe, is
possible. (Cf. also Fillmore's arguments in "The Case for
Case".) T will, however, have to contain a transformational
component which permits at least permutations and deletion
recovery (for the source language), and permutations and
deletions (for the target language).

b) It must be able to map the lexical items and the semantic
relations expressed in t into the equivalent equimolecular
lexical items and semantic relations of the target language
sentence t'. This requires either a translation component:
Source language -> Target language, or an interpretation
component: Natural language <-> Interlingua, for each of the
languages involved in the translation process.



c) It must be able to derive at least one sentence t' with its
syntactic description from the semantic reading of t'. The
syntactic structure of t' will have to correspond to the







This definitely requires aa) the association component of K,
bb) the capability not to be restricted to sentence-by-
sentence translation, and cc) a lexicon in which terms with
different meanings in particular areas of provenience - which
are not disambiguable by means of semantic features - are
equipped with area of provenience information (remember
conductor, in example 35, page 45). Device T, of course,
needs the capability for exploiting such area of provenience
features.

e) T must have access to the definitions of a meaning rule
component. This requirement can be replaced and, for the
time being, should be replaced by updating the source-target
language dictionary by providing a combined translation in the
target language of the terms of the definitions of the
"difficult" item in the source language; this combination
can be treated as one lexical item, possibly with internal
variable slots (cf. examples 6 and 7 in this chapter).

In addition, the following properties are desirable : 

f) It should be able to provide a translation t’ for sentence 

t whose syntactic description is identical or similar to f 




are 



isomorphic or similar to structures occurring in the 
input language. 



g) It should be able to provide a translation t' for sentence
t whose stylistic evaluation is identical or similar to the
stylistic evaluation of t. This means T should have a
stylistic component which can possibly be simulated by
stylistic features associated with lexical and syntactic
structures.



h) T should be able to associate a translation t' with a
sentence t in such a way that, if t is ambiguous in some
specified fashion, t' is ambiguous in the same fashion.

This desired property of MT system T, complementary to
requirement d, is really a makeshift solution, proposed
because of the current but, hopefully, passing inaccessibility
of the information provided by the context to mechanical
devices.



i) Finally, T should be able to produce a non-ambiguous
translation t' for a non-ambiguous sentence t.



As we see, an MT device should incorporate a greater part of
the components of a comprehension device and some additional
components pertaining to the output in a foreign language to
provide syntactically similar and stylistic translations. We

not be relevant for an MT system.



We conclude that translation by mapping semantic relations
between molecular or atomic expressions of the target language
into equivalent equimolecular expressions (or combinations of
expressions), under preservation of the semantic relations, is
possible. Such translation can, in general, be performed on
the level of semantic readings (DSA-images). Access to the
short-span memory, the association component of K, to select
the proper reading in cases of ambiguity, will be necessary.

The extent to which access to the association component cannot
be avoided, or to which this necessity can be replaced by
relying on the intellectual capabilities of the reader of the
translation, has not been investigated so far.

We shall discuss in the subsequent chapter which of the better 
known, current linguistic theories account for the requirements 
that we expect from such an MT device or, at least, the extent 
to which they account for them. 











4. The Capabilities of Current Competence Models or
The Properties of a Realizable Mechanical Translation Device

In the preceding chapter we gradually developed the properties
of a hypothetical MT device T, based, in part, upon the
linguistic problems occurring in translation which T must be
capable of solving, and, in part, on certain esthetic
expectations. These require that T carry across into the target
language the message to be translated, in a way closely
corresponding to the structure and the evaluation associated with
the message in the source language. In this way, we increased
the capabilities of MT device T until it approximated to some
extent the capabilities of a human translator.

In this chapter we want to determine the extent to which these
hypothesized capabilities are actually realizable within the
framework of the current better known grammatical models. The
models we have in mind are: a) the various realizations of
transformational grammar, as the "standard" model; the "extended
standard" transformational grammar; and the "universal base
hypothesis", the transformational grammar with a generative
semantic base component, b) the case grammar of Fillmore,




or non-coherent sentences. Competence models are regarded as 
components of performance models which account for such human 



capabilities as the production and understanding of sentences 
in actual speech situations or simulations of them, i.e. the 
production and understanding of coherent sentences. 



These limitations of the capabilities of a competence model
limit the capabilities of our MT device T. The main
requirements which cannot be met in current competence models are:

requirement d (page 47), the disambiguation of sentences
based on the information given in the context;

requirement c, the derivation of sentences from their
semantic representations;

requirement e, the production of translations for source
sentence - target sentence pairs whose semantic
representations contain equivalent combinations of items
with different internal molecularity, by means of a
meaning rule component;

requirement g, the production of translations which
have the same stylistic interpretation as the corresponding
source language sentences.



It seems that the "universal base hypothesis" grammar

those deep structures by means of the rules of the
transformational components of the individual languages. Since,
however, this model has been described only sparsely - by
means of a few examples restricted to English - we arrive
at the conclusion that none of the requirements stated
above can be met by the current competence models.

We are thus confronted with the choice either of attempting to
simulate or construct a component of a performance model
which permits us to meet requirement d and possibly c
(we assume we can dispense with requirement g without
considerable loss to the quality of the translation) or of lowering our
requirements for MT device T to make it compatible with the
current capabilities of the existing competence models. The
latter possibility is the one normally taken by proponents
of MT and automatic information retrieval. For MT it means
that the original definition of translation as an association
of source language sentence t, with the meaning R(t), with
the corresponding target language sentence t', with the same
meaning, i.e. as a mapping of meaning into meaning, is
changed to a definition of translation as an association of
the lexical items in t and the syntactic structures which
interpret them with the corresponding lexical items in t' and
the corresponding syntactic structures which interpret them.



for , 



In addition, the restricted device cannot account for
verbal phrases and idiomatic expressions if they do not
have a literal correspondence in the other language.
(To my knowledge, Gruber's proposals have not been
incorporated into transformational grammars or any other of the
mentioned grammars.) From a practical point of view, however,
we can assume, based on experience in translating actual texts,
that this restriction may still provide generally satisfactory
translations, especially for languages whose syntactic
structures are similar.



What are now the theoretical requirements for such a
translation process? We need to be able to associate with a source
language sentence t a syntactic representation, preferably
the deep phrase marker; we need to map this representation
into the corresponding deep phrase marker of the target
language, and we need to derive from that deep phrase marker,
by means of transformation rules, the corresponding surface
sentence t'.



Though the algorithms which perform such recognition ,• mapping 



and p roduct i on have b e en de ; 




........ ., , ^.Sde's!c1^pt:i.ons:S±:orsany'S^^C" 


language and the lack of a component which is part of all 




a) its own syntactic and semo-syntactic properties, and
b) the syntactic and semo-syntactic properties of the
environment in which it may occur.

Confronted with this gap, we again have two choices: to lower
the requirements for an MT system even further by allowing a
lexical component which does not contain such features, or
to construct such a lexical component, a difficult, tedious,
and time-consuming task.



The first choice, in spite of the intermediate development
of transformational recognizers, would lead to systems which
perform only slightly better, as experience has shown, than
the ones criticized in the ALPAC report.



Thus, really only the second choice is open for a designer
of an MT device: he has to rely on a complete lexical
description of the languages that he is dealing with; he
has to construct his own featurized lexicon or hope that
somebody else may have produced one from which he may be
able to profit.






What sort of approximations to the additional capabilities
of device T can we expect from a restricted hypothetical
MT device T' which performs mechanical translation based
on a lexicon with features and a grammatical description
of the languages involved in the translation process?
(We do not share Petrick's opinion about the length required
and the extent of difficulties involved in the construction
of comprehensive grammars; we believe that his pessimism is
based on the fact that he considers the difficulties
primarily from the point of view of transformational grammars.)
Those additional requirements are:

requirement f (page 48), syntactic similarity of source
and target sentence structure or at least preservation
of the relative order of the lexical equivalents;
furthermore,

requirement h, the carrying across of lexical and/or
syntactic source sentence ambiguity; and

requirement i, the carrying across of source sentence
non-ambiguity.









The first requirement might be met by establishing additional
correspondences between the relevant reverse (source language)
transformations and the order in which they apply with the
corresponding forward (target language) transformations,

items may be easily incorporable into T' and may thus serve
as a means to select one translation from a set of
transformationally related translations.



The second requirement would mean that from the translations, i.e.
the sets of surface sentences:

    A_1 = { t'_{1,1}, t'_{1,2}, ..., t'_{1,m} }
    A_2 = { t'_{2,1}, t'_{2,2}, ..., t'_{2,k} }
    ...
    A_n = { t'_{n,1}, t'_{n,2}, ..., t'_{n,j} }

the one occurring in each or in the greatest number of the sets
A_1 through A_n would have to be selected (where source sentence
t has the deep phrase-markers DM_1, DM_2, ..., DM_n

such a procedure would not be practical.
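
The selection step itself is simple to state, whatever its practicality. The sketch below assumes that each set A_i has already been produced (one set of surface sentences per deep phrase-marker of t); the function name and the tie-breaking rule are assumptions.

```python
from collections import Counter
from typing import Iterable, Optional, Set


def select_shared_translation(sets: Iterable[Set[str]]) -> Optional[str]:
    """Return the surface sentence occurring in the greatest number of
    the sets A_1 ... A_n (ties broken alphabetically, an arbitrary choice)."""
    counts = Counter(sentence for a in sets for sentence in set(a))
    if not counts:
        return None
    best = max(counts.values())
    return min(s for s, c in counts.items() if c == best)
```

For example 31, two invented sets such as `{"John watched the man with the telescope.", "With the telescope John watched the man."}` and `{"John watched the man with the telescope.", "John watched the man who had the telescope."}` share exactly one sentence, the one that carries the source ambiguity across.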

The third requirement would mean that T' would have to
generate all sentences generatable from the mapped deep
structure, analyze each of the generated surface sentences
again by means of the input component of the target language
and select one of those sentences which have only one
deep phrase structure representation.
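
This generate-and-reanalyze procedure can be sketched as a filter. The analyzer is passed in as a function, since no particular recognizer is assumed; `TOY_ANALYSES` is an invented table standing in for the input component of the target language.

```python
from typing import Callable, Dict, Iterable, List


def select_unambiguous(candidates: Iterable[str],
                       analyze: Callable[[str], List[str]]) -> List[str]:
    """Keep the generated sentences that receive exactly one analysis."""
    return [s for s in candidates if len(analyze(s)) == 1]


# Invented analyses, for illustration only.
TOY_ANALYSES: Dict[str, List[str]] = {
    "John watched the man with the telescope.": ["instrument", "attribute"],
    "John used the telescope to watch the man.": ["instrument"],
}


def toy_analyze(sentence: str) -> List[str]:
    return TOY_ANALYSES.get(sentence, [])
```

Every generated sentence has to be analyzed again, so the cost grows with the number of sentences generatable from the mapped deep structure.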



thus also relinquish requirement h and the first part of
requirement f. (The abandonment of the second part of
requirement f, preceding page, would possibly impose too heavy a
burden on the powers of the reader to interpret correctly.)








/ To 



1) recognition of the deep phrase-marker(s) of source
language sentence t,

2) mapping of the deep phrase-marker(s) of t into the
deep phrase-marker(s) of t',

3) production of some target language sentence t' from
(each of) the phrase-marker(s) of t'.
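
The three steps can be laid out as a pipeline. Everything in the sketch is a toy assumption: a deep phrase-marker is reduced to a (verb, subject, object) triple, the recognizer handles only three-word sentences, and `VERB_MAP` is an invented two-entry lexicon.

```python
from typing import Dict, List, Tuple

# Toy stand-in for a deep phrase-marker: (verb, subject, object).
Marker = Tuple[str, str, str]

# Invented source-to-target verb lexicon.
VERB_MAP: Dict[str, str] = {"sieht": "sees", "hoert": "hears"}


def recognize(sentence: str) -> List[Marker]:
    """Step 1: return the marker(s) of a three-word source sentence."""
    subject, verb, obj = sentence.rstrip(".").split()
    return [(verb, subject, obj)]


def map_marker(marker: Marker) -> Marker:
    """Step 2: map the source marker into the target marker."""
    verb, subject, obj = marker
    return (VERB_MAP[verb], subject, obj)


def produce(marker: Marker) -> str:
    """Step 3: derive a surface sentence from the target marker."""
    verb, subject, obj = marker
    return f"{subject} {verb} {obj}."


def translate(sentence: str) -> List[str]:
    """One target sentence per recognized source marker."""
    return [produce(map_marker(m)) for m in recognize(sentence)]
```

An ambiguous source sentence would yield several markers in step 1 and therefore several target sentences, one for each marker.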

We assume that such translations may be satisfactory,
especially if performed between related languages. In view
of the problems which will confront such a translation
procedure (cf. Chapter 3), we regard MT device T' as an
intermediary solution. We personally feel that the model
which should be strived for is MT device T. In the following
chapter we shall describe an approximation to such a device T,
the Linguistics Research System.




^ Hv-' Jr i i 
















5. The Linguistics Research System



"Everything in nature, in the inorganic world as well as in
the organic world, happens according to rules, though we
do not always know these rules ...

The use of our capabilities also occurs according to
certain rules which we follow, at first unconsciously,
until gradually, through attempts and continuous usage of
our capabilities, we obtain a knowledge of them, even acquire
such a fluent usage of them that it takes much effort to
imagine them in the abstract. Thus, e.g. the general grammar
is the form of a language as such. But one does speak without
knowing the grammar; one has indeed a grammar, and speaks according
to rules, but one is not conscious of them.

Like all our capabilities, our reasoning is subjected in
its actions to rules which we can investigate." (Translated
from the first through third paragraphs of Kant's Introduction
to his Logik.)








building blocks to form larger configurations. Each component
consists of a set of algorithms and instructions which are
executed by the algorithms; they modify the general
operations of the algorithms in a prescribed way. Such
instructions are linguistic rules, dictionary rules, syntactic
rules, interpretation rules, transformation rules, meaning
rules, mapping rules, connection rules, and others.

In its basic configuration LRS is a grammatical model for
the recognition and production of synonymous sentences in
natural language with identical or different deep structures.
By deep structures we mean the stage of a sentence derivation
in standard transformational grammar when all base component
rules, constituent and feature re-writing rules, have applied
but before lexical insertions have been performed.

The purpose of this model is to associate with each 
sentence in a natural language all its canonical form (KF) 
representations . A sentence which has one semantic reading 
has one canonical form, a sentence which has n semantic 





LRS has the power of an interpretative semantic model in
that it assigns the same KF reading to synonymous sentences
with different deep structures. It has the power of a
generative semantic model in that, given a particular KF
reading k, it permits the generation of all sentences with
different deep structures with that reading k.



A canonical form consists of a sequence of connected canonical
form expressions (KF expressions). The language of canonical
forms K has the following properties:

a) Each KF expression is a primitive element of K
(it has, for the user, one and only one (atomic) semantic
interpretation); if a surface terminal k has n different
senses or meanings, then n different KF expressions or
connected KF expressions represent the different senses of k.

b) No two different (connected) KF expressions p and q
are synonymous. If two surface terminals have one sense in
common, then that reading is represented by the same (connected)
KF expression.






Numerous statements have been made in history as to whether
such a canonical language can be constructed. Counterarguments


















however, should not be a reason to abandon this notion. 

However, compare Catford: A Linguistic Theory of Translation,
An Essay in Applied Linguistics, and Hjelmslev: Prolegomena
to a Theory of Language.

Due to the lack of a theory of semantics applicable to the 
mechanical recognition and production of sentences in natural 
language and because of the immense difficulties involved in 
the construction of canonical forms, LRS represents the 
meaning of sentences by means of normal forms. 

The normal forms of a language are distinct from canonical
forms of a language in that the lexical primitives of normal
forms may be both atomic and molecular with respect to the
canonical forms, for example, bachelor, unmarried man,
unmarried human adult male. When information retrieval or
translation from any language into any language is attempted,
the normal form representations will either have to be replaced
by canonical form representations or, more economically,
the meaning rule component will have to be expanded to permit
the construction of the particular required canonical form
when logical conclusions have to be found, or when different

(f ath< 

The process ^o: 






The surface component, the standard component, and
the normal form component.

One grammar, the surface grammar, the standard grammar, and
the normal form grammar, is associated with each component.

The non-terminal and terminal vocabulary symbols of each
grammar are complex symbols (except for the terminal symbols
of the surface grammar). Each complex symbol consists of a
category symbol and zero or more subscript or feature symbols;
each subscript may have zero or more values.
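A complex symbol as just described can be pictured as a small record. The sketch below is an illustration only: the class name, the field names and the sample subscripts (GENDER, NUMBER, COUNT) are assumptions and are not taken from the LRS implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ComplexSymbol:
    """A category symbol plus zero or more subscripts, each with zero or more values."""
    category: str
    # Maps a subscript name to its set of values; an empty set means "no values".
    subscripts: dict[str, set[str]] = field(default_factory=dict)

    def values(self, subscript: str) -> set[str]:
        # An absent subscript is treated like a subscript without values.
        return self.subscripts.get(subscript, set())


# Hypothetical symbols: a noun with three subscripts and a bare category symbol.
noun = ComplexSymbol(
    "NOUN",
    {"GENDER": {"masculine"}, "NUMBER": {"sg", "pl"}, "COUNT": set()},
)
bare = ComplexSymbol("ADV")
```

Keeping the values of a subscript as a set makes the set-theoretical operations used by the rule schemata straightforward to express.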



The grammar rules used during the recognition and production
of sentences, both performed as a bottom-to-top direct
substitution analysis, are generated by the processing
algorithms by means of instructions represented as context-free
rule schemata. A constituent in the consequent of a
rule schema matches every analyzed (WS) complex symbol from
which it is not distinct, i.e. it may match a whole complex



is successfully applied if each of the positive and negative
conditions for each constituent in the rule schema is fulfilled
by the matched complex WS symbol, and if all the
required relations between two or more constituents stated
in the rule schema hold between the corresponding complex WS







The conditions that may be stated for individual constituents
in a rule consequent are:

a) A particular category symbol may not or must contain
a particular subscript or combinations of subscripts.

b) A particular category symbol may not or must contain
a particular value or combinations of values.

c) Operations between subscripts of different constituents
may not or must be successful. These operations, the
set-theoretical operations Intersection, Sum and Difference, are
performed with the values of the specified subscripts.
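Condition c) can be illustrated with ordinary set operations. The sketch below makes two assumptions that the report does not state: that Sum corresponds to set union, and that an operation counts as "successful" when its result is non-empty. The case values are invented for the example.

```python
def intersection(a: set, b: set) -> set:
    return a & b


def set_sum(a: set, b: set) -> set:
    # Assumption: "Sum" is read here as set union.
    return a | b


def difference(a: set, b: set) -> set:
    return a - b


def successful(result: set) -> bool:
    # Assumption: an operation succeeds when it yields at least one value.
    return bool(result)


# Hypothetical CASE values of an article and of the noun it precedes.
article_case = {"nominative", "accusative"}
noun_case = {"accusative", "dative"}

shared_case = intersection(article_case, noun_case)
```

Under these assumptions a rule that requires case agreement between the two constituents would apply, since the intersection yields the single value "accusative".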

Each rule schema of each grammar consists of a syntactic part
and an optional transformational part. For surface and
standard grammar the syntactic part of each rule schema
consists of context-free rewrite rules. The transformational
part contains only transformations whose structural description
is satisfied by a string of symbols interpreted by the
constituents of the rule schema consequent. The transformations
possible in surface and standard grammar are: permutations,
deletions, and insertions. The transformations are "feature-




The rules of the normal form component differ from surface
and standard rules in two respects:

a) They apply to connected graphs;

b) They are not rewrite rules.

An NF rule applies to all graphs, terminal, non-terminal,
or combinations of them, whose nodes, labeled by complex
symbols, are non-distinct from the complex symbols in the
consequent of the NF rule. The antecedent of the NF rule
assigns to all graphs to which it applies a particular
semantic reading, an NF expression, represented by that
antecedent. Since NF expressions apply to graphs whose nodes
are labeled by complex symbols, it is possible to assign a
particular NF reading to a terminal k with a particular part
of speech interpretation and with a particular selection
restriction. At the same time, all graphs t_1, t_2, ..., t_n
interpreted by the same NF expression k are substitutable
for one another, regardless of whether the roots
and end nodes of t_i are identical to or different from those of
t_j (1 ≤ i, j ≤ n; i ≠ j). (It is theoretically possible that t_i
and t_j have identical roots and end nodes and still




D E D / E ) . 




The normal forms of an ambiguous sentence t may be connected
by means of "or" links, resulting in one connected normal
form. Assume that the normal form of each of the following
sentences is represented by the associated graph.

I watched a public vehicle conductor.

[Graph: I, watch, conductor, public vehicle]

I watched an orchestra conductor.

[Graph: I, watch, conductor, orchestra]

I watched an electric conductor.

[Graph: I, watch, conductor, electricity]

Then I watched a conductor can be represented as

[Graph: I, watch, and three "or"-linked branches: conductor with
public vehicle, conductor with orchestra, conductor with electricity]

An "or"-link is represented by a line which meets or intersects
a labeled line at a right angle. (These graphs are simulated,
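The idea of joining the readings of an ambiguous sentence into one connected form can be sketched as follows. The tuple encoding of a reading and the labels conductor-1 to conductor-3 are assumptions made for this example; the report itself uses labeled graphs.

```python
def conflate(readings):
    """Split readings into their shared initial part and the "or"-linked remainders."""
    shared = []
    for column in zip(*readings):
        # Stop at the first position where the readings disagree.
        if len(set(column)) != 1:
            break
        shared.append(column[0])
    alternatives = [reading[len(shared):] for reading in readings]
    return tuple(shared), alternatives


# Hypothetical encodings of the three readings of "I watched a conductor".
readings = [
    ("I", "watch", "conductor-1", "public vehicle"),
    ("I", "watch", "conductor-2", "orchestra"),
    ("I", "watch", "conductor-3", "electricity"),
]
shared_part, or_links = conflate(readings)
```

The shared part is stored once, and only the differing branches are kept as alternatives. This mirrors the single connected normal form with "or" links, although a flat tuple is of course much simpler than a graph.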








It is the function of the surface component to assign to
each surface sentence t all its syntactic readings according
to the surface grammar; ambiguous lexical items which have
the same part-of-speech interpretation are represented as
one "conflated" lexical item in a surface reading. After
surface analysis all readings which are not dominated by the
initial symbol S are deleted; then all transformation
instructions contained in the remaining rules are executed.
They associate with each of these readings a tentative
standard string. Tentative standard strings consist of
complex standard terminal symbols; these may be conflations
of surface terminals and their dictionary interpretation,
and dummy symbols which are introduced by the transformations
of the surface rules which applied. Dummy symbols represent
grammatical morphemes and elided lexical items. Elements
which were discontinuous in the surface are contiguous in
the tentative standard strings.

These strings are then analyzed by the standard grammar, which
assigns them a standard description and also filters out all
those strings which are not well-formed according to the
standard grammar.




It is not necessary that the roots of the graphs interpreted
by the same NF expression are labeled by the same
category symbol. It is thus possible to define adjectives
and nouns, e.g. spectrum - spectral, as
synonymous in one reading by assigning each member of such
pairs the same NF expression. The same holds for adjectives
and verbs, e.g. bright - to shine, or nouns and verbs, e.g.
destruction - destroy, etc. It is also possible to define
synonymy relations between lexical units and idiomatic
expressions like die - kick the bucket, or lexical units and
phrasal expressions like strike - give a blow - receive a
blow or kill oneself - commit suicide, etc. In the latter
examples the actual synonymy relation is established between
the verb strike and the noun blow, or between the verb kill
with the feature reflexive and the noun suicide. The verbs
give, receive and commit are introduced as empty verbal place
holders; in addition, receive is defined as the logical
converse of give, permitting such paraphrases as Mary hit
John, Mary gave John a blow, John received a blow from Mary.

It is also possible to define synonymy relationships between
lexical pieces which have an internal variable slot without
affecting their transformational

this may be realized from such generatable paraphrases as
All men are not virtuous - No man is virtuous, etc. or
He overlooked this - He did not take this into account,
etc. or A precedes B - B follows A, etc. or A is larger than
B - B is smaller than A, B is not as large as A, etc.

How LRS assigns such paraphrases the same NF reading can
be found in Lehmann - Stachowitz, 1970, Vol. II, pp. 217-268.
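A very reduced picture of how a shared NF expression yields paraphrases is given below. The lookup table, the NF labels and the function name are hypothetical; in the report NF expressions are assigned to graphs by NF rules, not to strings by a dictionary.

```python
# Hypothetical table: each lexical unit or phrase is paired with the label of
# the NF expression that interprets it. Units sharing a label are synonymous
# in that reading.
NF_LEXICON = {
    "die": "NF-DIE",
    "kick the bucket": "NF-DIE",
    "strike": "NF-STRIKE",
    "give a blow": "NF-STRIKE",
    "kill oneself": "NF-SUICIDE",
    "commit suicide": "NF-SUICIDE",
}


def paraphrases(unit, lexicon=NF_LEXICON):
    """Return all other units interpreted by the same NF expression as `unit`."""
    nf_label = lexicon[unit]
    return sorted(u for u, label in lexicon.items() if label == nf_label and u != unit)


idiom_for_die = paraphrases("die")
```

The converse relation between give and receive is deliberately left out here, since it also requires exchanging the arguments and cannot be shown by a plain lookup.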

During production, the recognition process is reversed.

Each NF expression k is replaced by all the standard rule
schemata interpreted by k. The standard grammar rules thus
obtained and only those are used for the generation of
standard strings in a regular bottom-to-top recognition process.
The combinations of all graphs which are connected with a root
labeled by the symbol S represent the legitimate standard
readings; all others are filtered out.

The terminal standard strings obtained from each well-formed
standard reading are then analyzed by the rearrangement
grammar of the language which

b) deletes the standard dummy symbols, and




This basic component of LRS just described is based on the
following linguistic assumptions:

1) that grammatical relations can be more easily and
correctly stated for standard strings;

2) that surface information is necessary for correct
semantic interpretation;

3) that synonymous sentences can be reduced to the
same "universal" representation.

This component is part of the Linguistics Research System
for Mechanical Translation and the Linguistics Research
System for Information Retrieval.



In the remainder of this chapter, we will cursorily describe
those components of LRS which are essential for performing
mechanical translation of sentences in natural language.

More detailed information can be found in our forthcoming
report Lehmann - Stachowitz, 1971a, and in Lehmann -
Stachowitz, 1970, Vol. II. The components of LRS pertaining
to an information retrieval system are described in Lehmann -
Stachowitz, 1971b.

Based on the problems represented in the examples of Chapter 3,










these are:




Contextual information is that type of information which can
be derived from the speech situation, the belief systems and
world knowledge of speaker and hearer. Terms also used to
denote this type of information are: "pragmatics", "pragmatic
information", "socio-psychological information". Co-textual
information refers to the speech acts that precede and follow
an utterance. In case of a written utterance, co-textual
information is represented by the written utterances which
precede and follow the given utterance. Textual information
is that information available from an utterance or a written
utterance itself when contextual and co-textual information
are ignored.

Translation based on all three types of information we regard
at present as being beyond the requirements for an MT system;
the situation may change, though, once intensive research in
discourse structure shows the necessity for it.

The LRS translation system performs translation based on
textual information from the basic input component
and on co-textual information contained in the immediate
environment of a sentence, derived by means of an approximation
of the short-span memory mentioned on page 20, Chapter 1.





preceding and following textual environment are properly
preserved. Thus, we would prefer to translate A geht B voraus
as A precedes B rather than as B follows A, or Alle
sind tugendhaft as All men are virtuous, rather than No man
is not virtuous, or A verkauft B as A sells B rather than
B is sold by A.

This capability is obtained by means of the fact that
NF expressions are represented as complex symbols containing
essential and accidental features. Essential features pertain
to properties of an interlingua, accidental features to the
properties of a particular language represented by lexical
pieces and syntactic structures. Thus, the various graphs
in Figure 12, which we repeat below as Figure 16, are all
representations of the NF expression 29.











The numbers in Figure 16 represent accidental features. They
permit a more precise translation as, for example, from or
into the German counterparts represented as:

[Figure: graphs for NF expression 29 labeled with the German
verbs verkaufen, kaufen, zahlen, uebergehen]



Similarly, syntactic information like active sentence,
passive sentence, can be added by means of accidental features
to the NF expressions which interpret these structures. (If
an NF expression cannot be mapped into an identical (i.e. including
accidental features) NF expression of the target language, all
NF expressions in that normal form are mapped by means of
only the essential features.)
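The fallback rule in the parenthesis can be sketched as follows. The representation of an NF expression as an (essential, accidental) pair, the function name and the sample numbers are assumptions made for this illustration.

```python
def map_normal_form(normal_form, target_inventory):
    """Map a normal form, falling back to essential features when necessary.

    normal_form      -- list of (essential, accidental) pairs
    target_inventory -- set of (essential, accidental) pairs that exist
                        in the target language
    """
    if all(expression in target_inventory for expression in normal_form):
        # Every expression has an identical counterpart: keep accidental features.
        return list(normal_form)
    # Otherwise all expressions are mapped by their essential features only.
    return [(essential, None) for essential, _accidental in normal_form]


# Hypothetical normal form: NF expression 29 with accidental feature 2, etc.
source_nf = [(29, 2), (31, 1)]
full_match = map_normal_form(source_nf, {(29, 2), (31, 1)})
fallback = map_normal_form(source_nf, {(29, 2)})
```

Note that a single expression without an identical counterpart makes the whole normal form fall back, which follows the wording "all NF expressions in that normal form".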

Machine translation is performed by means of the following
components:

1) the basic recognition component, which derives the
normal form of t

if t is ambiguous;

2) the DSA-image component, which represents the normal

lished DSA-image of t, connects it with the SA-images of




4) the mapping component, which maps the normal form K
of t into the normal form K' of the target language;

5) the production component, which produces, by means of
the grammars of the target language, a translation t' of t.



Let us represent this translation process by means of the
sequence of sentences 38 through 40:

38. Im Museum sahen wir einen Leiter.

39. Den Leiter schaute sich eine alte Dame an.

40. Sie zerbrach ihn.

The corresponding English translation of the individual
sentences in the sequence is:

38a. In the museum we saw a leader [or: conductor
(animate), conductress, head, chief, executive,
manager, manageress, president, director,
directrice, superintendent, principal,
conductor (inanimate)].

39a. An old lady looked at the leader [or: conductor
(animate), conductress, head, chief, executive,
manager, manageress, president, director,
directrice, superintendent, principal,









[Figure 18]



The digits on the relation and property lines represent
molecular expressions. 31 may stand for "we"; 32 for "see";
33 for "inside of"; 35 for "museum"; 4 for "past"; 8 for "leader";
9 for "conductor (animate)"; 10 for "conductress"; 11 for
"head"; 12 for "chief"; 13 for "executive"; 14 for "manager";
15 for "manageress"; 16 for "president"; 17 for "director";
18 for "directrice"; 19 for "superintendent"; 20 for
"principal"; 21 for "conductor (inanimate)".






s 




The semo-syntactic information associated with the rule which
interprets the word schauen given in Figure 20 is exploited
by the transformation instructions associated with the rule
which rewrites the symbol S.



C V 

+ PR(0 ' 1 ' 2 * , . v) 

+ TS ( : ’ : * AN ' V . . ) 

+ SS ( : • : ' 2 ) 

+ CG C r ' t ’D rA i . . , ) 

+ TO ( : ! : ’ R . PO , AB 1 . . . ) 
+ SO ( : » : • 0 . 3 ’ . . . ) 



6 ahau. 



Figure 20 



This rule represents all (prefix-)verb combinations which
contain the verb schauen. The symbol C identifies the
category symbol VERB; subscripts are identified by a

"dative", A for "accusative", R for "reflexive", PO for
"physical object", AB for "abstract". The "." indicates
that the verb takes two objects; "," represents logical
or; the digits in SS and SO represent the order assigned
to the subject and the objects. (The verb always has the
order 1, the deep subject order 2, etc.) The value 0
expresses the fact that the reflexive object in the dative
is to be deleted. (This deletion is only performed for
genuine reflexive verbs.) The apostrophes represent columns
in the "feature matrix" of the verb.

By means of this information and the transformation
instructions associated with the nodes in the sentence,
the following standard string is derived.








To this structure the rules of the normal form grammar apply,
which derive the following normal form:

1 2 3 4 5 . ,6 

R(p><?) 3 x T-cwcg x 06* e.fiv<L 0 x Pa&tQ x AJtgum&nt'j x ANV „ x 

( P , q ) Op (x) + 

7 23 2 2 2 U r ‘ 7 

Numbe.Ji {SG) j x lady ^ x OZd Q x A^gumantyx Numb&A (SG) * x { 

■••• ' •• v;i\.Op (a) . ' '• 

„ 5 x s /• 10 . • 

mu&x.a- aondua£.oA-ma£e.Q, mu&A.c- aondua^oa- tf&maZe q , 





The items in script represent atomic or molecular NF expressions.
The information given in light face print represents
instructions for the DSA component; subscripts represent the
degree of the normal form expression, which preserves information
about the original standard constituency; the numbers
above an NF expression refer to the connected sub-graph in
the following figure, which has been interpreted by that
NF expression.











Note that the NF expressions represented by the digits 24
and 5 each interpret a sequence of connected standard trees.

The normal form of a sentence is processed by the DSA-image
construction component, which ignores all items which are
not of degree 0, or which do not have an operator statement
(indicated by light-face print), or which do not have an
identifier (indicated by a "+"). For each non-ignored NF
expression, the DSA-image component has an instruction:



a) every unary degree 0 symbol is represented as *- 



rf 



b) n-ary degree 0 symbols are represented as lenses or arrows
(binary), triangles (ternary), diamonds (quaternary);

c) other normal form expressions have special
instructions which have to be looked up in a set of operation
statements. These representations are connected with objects
represented by nodes according to wellformedness conditions
computable from the degrees of the non-ignored NF expressions.



38 









Let us now discuss the construction of the DSA-image of
sentence 39 from its normal form representation. The first
instruction, represented in NF expression 3, constructs a
lens with the end nodes p and q and calls the lens by the
name of the NF expression given in the DSA-image component.




The first instruction results in the following:








Instruction 4 states: Assign the predication "past" to the last
predication. NF expression 5 states: Replace one of the
variable node names in the existing graph by the name x (the
order of replacement is dependent on the inherent order of
the arguments of the predicator. It is reflected by the
alphabetical sequence of the letters in the graph.) Expression 6
states: Attach two "and-branches" to the node with which it
can be connected through the degree conditions. NF expressions
23 and 22 call these branches LADY and OLD. We have so far
obtained the following DSA-image:

Figure 25












Expression 24 states: change the next variable name to a.
Thus, q is changed to a. To this node, lines representing
the NF expressions 8 through 21 are attached by "or" links,
resulting in the graph:



The output of the DSA component is processed by the connection
component, whose purpose is to replace, if possible, the
names of the nodes in the DSA-image by the names of the
nodes in the already established SA-image.

The connection component has the following instructions:

For each node in the DSA-image which is named x it generates
a numerical subscript which has not yet occurred in
the SA-images, i.e. it assigns a numerical subscript which
is larger by 1 than the last that was previously assigned.

For each node named by a it performs a search through its
short-span SA-image and tries to replace the name a of that
node by the name of one of the nodes in its SA-image, based
on the predication associated with the node a. (We see that
only node x_2 in Figure 18, page 75, fulfills this condition.)
When all nodes in the DSA-image have been assigned their
proper names represented in Figure 27, this image
is connected with the established connected SA-image,
resulting in Figure 28. Duplications of predications upon
objects are not repeated.














The processing of the sentence Sie zerbrach ihn results in
the following DSA-image:

Figure 29

feminine object. "a_2" represents the pronoun ihn, which
can refer to a male object, a non-animate object or an
object of gender masculine.



The connection component tries to establish the referent 
for nodes and a.2 > beginning with its most recent SA- 

image , could refer to x^ or to X2 since all of its 

predications meet at least one condition of a,. a^, how- 

ever, can only refer to -Xg. Consequently, the SA- image 
for this sentence results in Figure 30 . 
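The search performed by the connection component can be sketched roughly as follows. Everything in this example is an assumption made for illustration: the node names, the predication labels, the encoding of an SA-image as a dictionary, and the choice to stop at the first (most recent) image that yields a candidate.

```python
def find_referents(conditions, sa_images):
    """Return the node names that may serve as referent for a pronoun node.

    conditions -- set of predications, any one of which licenses the reference
    sa_images  -- list of SA-images, oldest first; each maps a node name to
                  the set of predications made about that node
    """
    # Assumption: the search begins with the most recent SA-image and stops
    # as soon as one image offers at least one candidate.
    for image in reversed(sa_images):
        found = [name for name, predications in image.items()
                 if predications & conditions]
        if found:
            return found
    return []


# Hypothetical short-span memory after two sentences.
memory = [
    {"x1": {"human", "plural"}, "x2": {"masculine", "inanimate"}},
    {"x3": {"female", "human", "old"}},
]
# "ihn": a male object, a non-animate object or an object of gender masculine.
ihn_referents = find_referents({"male", "inanimate", "masculine"}, memory)
# "sie" read as a pronoun for a female or grammatically feminine object.
sie_referents = find_referents({"female", "feminine"}, memory)
```

With this invented memory the pronoun ihn finds no candidate in the most recent image and is resolved in the earlier one, while sie is resolved at once. The actual system works on connected graphs and degree conditions rather than flat sets.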




Its connection with the already established SA-image
results in Figure 31, in which all the ambiguities
represented by the "or" links associated with x_2 have
disappeared.








where the values "2" and "3" represent accidental features
carried over from German. This rule results in the retrieval
of the standard rule (a subset of the surface rule):

R 39
V V
+ PX(O)
+ TY(-AN)
+ SS(2)
+ OB(at)
+ TO(PO,AB)
+ SO(3)

; ■ . ' 

The standard sub -graphs associated with each normal form 

v ' , ' ^ t . 



i by the standa 



the 



re 



r 












The first and second sentences are disambiguated by
comparing their DSA-images against the established SA-image.
The disambiguation of DSA-images results in a
removal of the ambiguous NF interpretations for the
term Leiter.

The resulting normal form of sentence 39 is then mapped
into the identical normal form of the output language,
and the graphs associated with each output NF expression
are retrieved. One of these graphs is the NF rule

V LOOKAT
$ A(2)
$ B(3)

R 39
+ OB(at)




s 




The rearrangement grammar featurizes the dummy symbol PAST
and lexicalizes the feature at, resulting in the surface
string:










sufficient. DSA-images then only need to be constructed
for the immediate environment of an ambiguous sentence.

It may even be possible to restrict the construction of
such DSA-images to the unary predications of the objects
occurring in the environment. This decision, of course,
is dependent on the results of research in discourse
analysis. (In MT, it is not necessary to establish every
referent of an expression as it is for information
retrieval; it is only necessary to establish those referents
which help to disambiguate a particular sentence.)



That the system has the power to carry input ambiguity
across can be observed from the fact that the English
terminal conductor will be retrieved twice as an equivalent
item for Leiter, once through conductor-music-male,
and once through conductor-inanimate. It is fairly simple
to compute output terminals which have several meanings in
common with an input terminal (cf. Lehmann - Stachowitz, 1970).
However, this will only be necessary if the context does not
provide any disambiguating information.



The construction of ambiguous syntactic structures also has
to be performed by means of SA-images. Assume that the
sentence "John watched a man with a telescope" is
represented by the SA-image

[Figure: SA-image of the sentence]

where a stands for "use" and b stands for "have". In order
to map this ambiguity across, the system would have to be
provided with the knowledge that the structure

Figure 35

can be mapped as "with x" and the objects naming the nodes
have to occur in the surface order z, y, x where x has to
follow y directly. Such capabilities are those of a speech
production device which are currently not regarded as being
necessary for MT.




The capabilities of LRS are based on the following factors:

a) its subscript grammar with the feature-sensitive
transformations;

b) its normal form component;

c) its DSA-image and connection component; and most
important,

d) its lexicon.



The subscript grammar permits us to express in a rule relations
and government which correspond to the intuition

plural; present, past; nominative, accusative; etc.

We can also express semantic categories like human, animate,
abstract, etc.; stylistic categories like colloquial, vulgar,
learned; and lexical categories like morpheme and allomorph.
The subscript grammar permits us to express in a natural
manner such concepts as gender (with the values masculine,
feminine, and neuter) instead of representing it as a bundle
of unordered binary features as in

[+masculine] [-masculine] [-masculine]

[-feminine] [+feminine] [-feminine]

where the combination

[+masculine]

[+feminine]

has to be excluded by means of an ad-hoc segment-structure
rule.
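The contrast between the two encodings of gender can be made concrete with a small count. The variable names are assumptions chosen for this illustration; the binary encoding follows the two features shown in the text.

```python
from itertools import product

# Subscript encoding: one subscript GENDER with exactly three values.
GENDER_VALUES = {"masculine", "feminine", "neuter"}

# Binary encoding: every combination of (+/-masculine, +/-feminine).
binary_bundles = list(product([True, False], repeat=2))

# The bundle [+masculine][+feminine] has no interpretation and must be
# excluded by an extra rule.
ill_formed = [(masc, fem) for masc, fem in binary_bundles if masc and fem]
well_formed = [bundle for bundle in binary_bundles if bundle not in ill_formed]
```

Two binary features admit four bundles, of which one has to be ruled out separately, whereas the single subscript admits exactly the three intended values and needs no such rule.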

By means of the subscript grammar rules we can formulate 
redundancy statements, conflate ambiguous trees into one tree; 
we can also update the lexicon by adding additional necessary 
semantic features to it without having to make corresponding 
changes in the syntactic rules interpreting them. 

The transformational component permits the disambiguation of



The normal form component assigns an NF expression to
(connected) syntactic subtrees, and to lexical subtrees with
a specific set of semantic features within a specific
semo-syntactic environment. It is also able to assign a semantic
interpretation to verbal phrases and idiomatic expressions
with or without internal variable slots and to map these
NF expressions into the corresponding NF expressions of the
output language without affecting their transformational

where /fail/ represents the morpheme fail (the actual
allomorph is generated during the rearrangement stage).

NF rules are currently the only rules of the meaning rule 



compohcnt of LRS ; we are planning to extend this component 
to include 







The DSA-image component and connection component permit the
disambiguation of an ambiguous sentence by means of its
co-textual environment.



All the capabilities of the components mentioned would be 
ineffective if it were not for the lexicon which has to a 
large extent already been constructed at the Linguistics 
Research Center. The LRS dictionaries contain stems,
inflectional affixes (and, for German, two types of derivational
affixes: separable and inseparable prefixes) which are
concatenated by means of the surface word grammar rules.



These dictionaries are currently being updated by establishing 
for each stem 

a) its syntactic and semo-syntactic properties,

b) the syntactic and semo-syntactic properties of the
environment in which the item may occur with a particular
meaning.

Polysemic terms are thus represented as one term. The system
of rules

[Rule display garbled in the source; surviving fragments: "+ TY(H,IN)", "page", "surface rule".]

[...] interpreted as HOTELBOY or BOOKPAGE, or both, in environments
such as The page slept or The page tore or He touched the page,
respectively.
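The way a single entry for "page" can receive different readings in different environments can be sketched as follows. This is an illustrative sketch only: the feature names, the tables `READINGS` and `REQUIREMENTS`, and the function `readings` are our assumptions and do not reproduce the LRS rule format, which is not recoverable here.

```python
# One lexical entry with two readings, each carrying semantic features.
# Feature names are hypothetical.
READINGS = {
    "page": {
        "HOTELBOY": {"human", "animate"},
        "BOOKPAGE": {"inanimate", "physical_object"},
    }
}

# Features that a verb demands of the noun in a given role (assumed values).
REQUIREMENTS = {
    ("sleep", "subject"): {"animate"},
    ("tear", "subject"): {"inanimate"},   # intransitive use: "The page tore"
    ("touch", "object"): set(),           # no restriction on the object
}


def readings(noun, verb, role):
    """Return the readings of `noun` compatible with the environment."""
    required = REQUIREMENTS[(verb, role)]
    return sorted(
        name for name, features in READINGS[noun].items()
        if required <= features
    )


print(readings("page", "sleep", "subject"))  # only the HOTELBOY reading
print(readings("page", "tear", "subject"))   # only the BOOKPAGE reading
print(readings("page", "touch", "object"))   # both readings survive
```

The entry itself is stored once; the environment selects among its readings.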

Lexicographic work at LRC (cf. the appendix for details)
has already resulted in word lists containing

a) 10,000 German verbs and 10,000 English verbs,
both classified with respect to their object complements;
about 2,000 entries of the latter have been classified with
respect to subject and adverbial complements. Similar work on
the German verbs is in progress;

b) 33,000 German nouns (letters A through K) with about
70,000 English correspondences; the first 7,000 of these
German nouns have been classified according to the scheme
shown in the appendix;

c) 6,000 German and English verbal phrases (verb-noun
phrase and verb-prepositional phrase combinations), classified
as to subject, object, and adverbial complement.

Work on adjectives and adverbs is beginning.

Future additional lexicographic work at the Center will be
directed towards the establishment of a minimal set of
additional semantic features in order to disambiguate verbs [...]

we also plan to reduce the size of the surface dictionary
(projected number for German = 80,000 entries, for English =
100,000 entries) by removing productive derivations and
compounds from the dictionaries. This will be performed by
adding derivational affixes to the surface dictionary and
word formation rules to the surface word grammar. In order
to facilitate the design of the necessary word formation
rules for German and English, programs are presently being
constructed to analyze and display in concordance format the
analysis of each of the individual entries in the current
surface dictionaries by means of the whole surface dictionary
(to which all derivational affixes of the language have been
previously added).


The listed components, in particular the complete lexical
component, give LRS to a great extent the power of the
hypothetical translation device T (pages 46 through 49).

LRS can meet the requirements a through g:

a) derivation of semantic reading R for sentence t;

b) mapping of semantic reading R into semantic reading R';

c) derivation of sentence t' from semantic reading R';

d) disambiguation of t in context;

[...] resembles the syntactic structure of t;





Though LRS permits the carrying over of lexical ambiguities
(requirement h), we feel that this will not be necessary
because of the ability to disambiguate in context.
Requirement i), the carrying across of non-ambiguity of t into
corresponding non-ambiguity of t', can presently only be
obtained by re-analyzing standard string t'. This we do
not regard as practical. Carrying over of non-ambiguity
could be guaranteed by adding diacritics to t' which
simulate the labeled bracketing of t'. However, this may
not be very convenient for the reader.

Apart from its applicability to machine translation and
information retrieval, we assume that LRS also provides
reasonable explanations for a number of not easily explained
linguistic phenomena, as for example the occurrence of the
underlined the's in the sequence

41. One of Rembrandt's pictures was sold yesterday.

42. The seller was very happy with the price.

43. The buyer is probably an American.

If we represent then sentence 41 by the

[Diagram not recoverable from the source.]




we can explain the occurrence of the definite articles in
the sentences [...] 43 [...]
refer to (p, r and s) have been implied though not specified
in sentence 41.

We can also reasonably explain the following "paraphrases"
of sentences:

A and B kissed. A and B kissed one another. A and B
gave kisses to one another. A and B exchanged kisses.

which have complete or partial correspondences in a number
of languages such as French, Latin, Serbian, Hebrew, German.

Let us represent A and B kissed by

[Diagram not recoverable from the source: A and B linked by kiss.]

(which can also be read as: A and B kissed one another).

The nominalization of kiss results in the following diagram

[Diagram not recoverable from the source.]



In order to establish a relation between the three objects,
a diagram like

[Diagrams not recoverable from the source.]

[...] a kiss to B or B gives a kiss to A.

Since the kiss that B gives to A is identical with the kiss
A gives to B, we need to extend the graphs to

[Diagram not recoverable from the source.]




The resulting diagram, as we may observe, is similar to the
diagram for "sell" in Figure 7 (page 15), where one of the
conditions for the equivalence of the given objects, namely
money, has been removed. This exactly describes the actions
involved in an exchange of objects, thus permitting the
interpretations: A gives a kiss to B and (simultaneously)
B gives a kiss to A or A and B gave kisses to one another,
or A and B exchanged kisses (or: a kiss).

LRS, as we observe, is a complex configuration of components,
actually more complex than described in this paper. This
complexity, of course, is due to the complexity of the
processes occurring during speech recognition and speech
production. The question, however, that naturally arises is: How
efficiently, i.e. how inexpensively, can mechanical translation be
performed with LRS? We will try to answer that question in
the next and final chapter.


















6.



Progress in Hardware Development and the 
Future of Machine Translation 

The criteria according to which the feasibility of machine
translation is normally evaluated are: quality, speed, and
cost. In this chapter we do not want to deal with the first
of these criteria: our demands on the quality of MT output have
been stated and the quality of such output can really not
be evaluated before the output exists. We also want to ignore
speed, since speed is a factor which is normally used in
favor of machine translation. As to cost, we want to
restrict ourselves to costs arising from computer processing
and exclude those costs which might arise through pre-editing
and post-editing (though not in LRS, which is conceived as a
fully automatic MT system) and key-punching of a text.

Cost of computer time is dependent mainly on two factors:
the actual use of central processing time and the use of
input-output time. That the central processor can work with
immense speed is generally known; it is less known to the
non-specialist that input-output operations are by many
orders of magnitude slower than the central
processor and that the central processor must stop its
computations for a particular program until the input-output
operations for that program are completed.

Machine translation is a process which requires almost
constant input-output operations. We can visualize the
performance of a computer during machine translation by
imagining a human being A who reads a text according to the
following conditions: A has available different kinds of
information:

a) a dictionary consisting of a number of separate booklets
which contains all paradigmatic, syntactic, and semo-syntactic
information pertaining to a word,

b) a grammar which also consists of several separate
individual volumes,

c) a dictionary of word definitions or meanings consisting
of even more separate volumes than the paradigmatic dictionary,

d) a semantic grammar in several volumes which contains
the interpretation rules necessary for the computation of the
meaning of a text from the lexical items and syntactic
relations.



A has to read the text word by word. He may only continue
with the next text word if he has found the word that he
is currently looking at in one of the parts of the paradigmatic
dictionary. Actually, it must be in that part of the
dictionary which he is holding in his hand. If the word
occurs in that volume, he may proceed to the next word.
If not, he has to put this volume down and pick up another
volume and check whether the word occurs in it. By means of
an efficient search procedure he repeats this process until
he finds the volume which contains the word. He then looks
up the word and writes down its part of speech interpretation.
Then he proceeds with the next word. To speed up his
performance, A keeps the volume which he is currently "processing"
in his hand as long as possible because it might be the case
that the next word that he reads also occurs in that part. In
reality, to decrease the number of volumes of the dictionary,
A is not reading whole words but constituents of words. When
A has looked up and written down all the paradigmatic information
associated with each word constituent, he begins processing the
text again, beginning with the first word, this time consulting
his grammar books. The procedure is repeated in a similar fashion.
Then A starts using his dictionary for semantic analysis, and so
on.

The picking up of individual volumes and putting them down again
represent the input-output operations of a computer whose central
memory is simply not large enough to hold several volumes of a
dictionary, or even the whole grammar, since the memory must also
hold the programming instructions and the results of the
computations.
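The reader A and his volumes can be simulated in a few lines. The sketch below is an illustration under our own assumptions (the function name `count_volume_swaps`, the two toy volumes and the sample texts are hypothetical); it counts how often a different volume has to be picked up, which stands for the number of input-output operations.

```python
def count_volume_swaps(words, volumes):
    """Count how often the reader must pick up a different volume.

    `volumes` maps a volume name to the set of words it contains.
    The first pick-up is counted as well.
    """
    in_hand = None
    swaps = 0
    for word in words:
        if in_hand is not None and word in volumes[in_hand]:
            continue  # the word is in the volume already held: no I/O
        for name, entries in volumes.items():
            if word in entries:
                in_hand = name  # put the old volume down, pick this one up
                swaps += 1
                break
        else:
            raise KeyError(f"{word!r} is in no volume")
    return swaps


# Two toy volumes (assumed content).
VOLUMES = {
    "A-M": {"a", "cat", "dog"},
    "N-Z": {"sees", "sleeps", "the"},
}

scattered = ["the", "dog", "sees", "the", "cat"]
clustered = ["the", "sees", "the", "dog", "cat"]

print(count_volume_swaps(scattered, VOLUMES))  # 4 pick-ups
print(count_volume_swaps(clustered, VOLUMES))  # 2 pick-ups
```

The same five words cost twice as many pick-ups when consecutive words fall into different volumes, which is why A keeps a volume in his hand as long as possible.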



The advantage of the LRS subscript grammar G is that it represents
an abbreviated edition of a multi-volume grammar G'. (Some of the
subscript rules represent hundreds, a few even thousands of formed
context-free rules with simple [...].) The information in
grammar G permits the computer to compute the information contained
in grammar G', and only that information actually needed for the
analysis of the particular text sequence currently being processed.
And we recall that a computer can compute with extremely high
speeds.
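How one subscripted rule can stand for many context-free rules may be illustrated by a small expansion routine. This is a sketch under stated assumptions: the rule NP -> DET N, the feature inventory (including the genitive and dative cases) and the function `expand` are our own illustrative choices, not the LRS rule format.

```python
from itertools import product

# Assumed feature inventory; the values for number and gender follow the
# categories named in the text, the four cases are an assumption for German.
FEATURES = {
    "number": ["singular", "plural"],
    "gender": ["masculine", "feminine", "neuter"],
    "case": ["nominative", "genitive", "dative", "accusative"],
}


def expand(lhs, rhs, features):
    """Expand one rule whose symbols share feature variables into the
    plain context-free rules it abbreviates (full agreement assumed)."""
    names = list(features)
    rules = []
    for values in product(*(features[name] for name in names)):
        tag = "_".join(values)
        rules.append((f"{lhs}_{tag}", [f"{symbol}_{tag}" for symbol in rhs]))
    return rules


rules = expand("NP", ["DET", "N"], FEATURES)
print(len(rules))   # 2 * 3 * 4 = 24 context-free rules
print(rules[0])
```

Only the one abbreviated rule has to be stored; the individual context-free rules can be computed when a particular text sequence requires them.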









In spite of the advantages of the subscript grammar, we observe
that the problems pertaining to the recognition of the dictionary
items are not alleviated by means of such a grammar since the
number of dictionary entries is a given number which cannot be
changed. (The conflation of dictionary items, possible with a
subscript grammar, still does not change the number of entries.)

Fortunately, a development in computer hardware is in the offing
which will have decisive effects on machine translation and
other research areas which are forced to deal with large data
bases: the holographic memory. (Cf. Peter L. Briggs: "Holographic
Memories Could Make Others Obsolete", Part IV of "The Great
Memory Debate" in Computerworld, August 26, 1970, page 44.)

"Researchers now working with holographic memories claim that
one holographic memory the size of an average office desk will
have the capacity of all on-line storage in use in the Western
world." and that "The desk-size holographic unit, with several
100 trillion bits of storage, would exceed the capacity of all
of the disks, drums, and core memory now in use ..."

(Holographic memory) "will offer users multitrillion character
storage at ... prices probably less than one-thousandth of (the
current price) for large-capacity disk storage."

The information in such memories can be accessed with the speed
of light; "access times below 20 nano-seconds per character or
word or whatever (will be) feasible within five years. It is
possible that such memories may be sufficiently faster than
(the processing speed of) the best central processors, that they
can efficiently serve several large CPUs ... or several thousand
terminals at once." ... "Users have indicated that they really
don't have any idea what impact unlimited memory might have on
their DP (data processing) applications and system designs,
but they all agree that the whole way of using a computer ought
to change when the storage of data is no longer a factor, and
when the access speeds are as fast as the central processor
itself."48



The conclusions for MT are obvious. The speed and, consequently,
the cost of machine translation can be considerably reduced
because all the dictionaries, syntactic rules, semantic rules,
etc., even the processing programs can be stored in a part of a
holographic memory. The problems which remain in the production
of workable holographic memories, namely to make them erasable,
are no real problems for an established MT system since it will
be able to operate with a read-only memory. Changes and
additions to the grammars which will be necessary because of
neologisms that are introduced into a language can always be
stored on disk and be read into the central memory before
translation is performed.

In our opinion the real importance of such memories lies not so
much in the increased speed with which data processing can be
performed, but in the completely new methods of processing
data and solving problems that such memories will permit.

The various models of human performance that have been constructed
in the social sciences (sociology, economy, etc.) normally reflect
in some way the way we are accustomed to talk about a subject
matter. In linguistics we are accustomed to talk about sounds, 
morphemes, words, syntax, semantics, and even about context 
and pragmatics. Linguistic models, however abstract, in some 
way reflect this way of our talking about language. Thus, we 
have hierarchical phonological, syntactic and semantic "levels" 
in some models, and phonological, syntactic and semantic 
"components" in others. The effect of each component or level 
is twofold: 

a) it assigns to the data an interpretation according 
to its instructions, and 

b) it eliminates those interpretations which were well- 
formed according to the instructions of previous components 
but which are not well-formed according to its own. 

Holographic memories may change our way of constructing models,
which is based on 19th century investigations and considerations
(John Stuart Mill); according to those we assume one or a few
variables for the analysis of a complex phenomenon and keep
all other factors invariant. The fact that we speak of
several levels or components of "language", like phonetics,
phonemics, morphology, lexicon, syntax, semantics, pragmatics,
etc., has not been imposed on us because of the nature of
language but because it is easier for us to treat individual
phenomena by ignoring certain others, especially if those
others are very complex and really not quite understood. With
the capabilities of computers expanded in such a way, we can
finally begin to re-introduce the total approaches
(ganzheitliche Methoden) by mentioning the conditions for all the
variables that we know.

Now, what does that mean for machine translation? Since the
projected access time of such memories, about 20 nano-seconds,
is shorter than the time needed for a minimal basic computer
operation, it means that such a memory can be read by several
computers "simultaneously".

We could thus theoretically construct a machine translation
system in which one computer performs dictionary analysis;
one, word analysis; one, syntactic analysis, etc.; one
computer for each component of the system. The intermediate
output of each computer could immediately become input for the
next "higher" computer, which again would give its output to
the one "above" it, etc. At the same time, each computer
could return the results of its own computation to the
computer working directly "below" it in the hierarchy. Of
course, we are not seriously proposing a system consisting
of several computers to perform machine translation, but it
is generally known that we can simulate on one computer the
performance and capabilities of several computers. We can
thus write programs which no longer analyze the data in a
hierarchical "horizontal" fashion but in a hierarchical
"vertical" fashion, which is the way the human brain operates
during the understanding and production of sentences. Nobody
would seriously assume that semantic interpretation is
performed over the output of some type of complete syntactic
analysis represented by a tree with the root S. If that were
so, strings of words like those underlined in the following
sequence:

George said: [...] I had ... As usual he could not
finish his sentence because Mary interrupted him,

could not be understood. And that we really understand
sentences sequentially is clear from many observations, like
the following: During a conversation between two people A
and B, B explains some matter to A and hesitates, grasping for
some word that eludes him. A provides the missing words and
continues the sentence for B.

It is perfectly possible that mechanical translation performed
with such a "vertical" model will approximate "simultaneous
translation"; that, while the system is still processing
source language text on the input side, it is already producing
target language translations on the output side.
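The simulation of several cooperating "computers" on one machine can be sketched with Python generators, each stage handing a result upward as soon as it has one. This is a toy illustration only: the stage names, the three-word lexicon and the word-for-word glossary are our assumptions, the return of results to the stage "below" is not modeled, and word-for-word substitution is of course not the translation method described in this report.

```python
def source(words, log):
    """Lowest stage: reads one text word at a time."""
    for word in words:
        log.append(("read", word))
        yield word


def lookup(stream, lexicon):
    """Dictionary stage: attaches a category to each incoming word."""
    for word in stream:
        yield word, lexicon.get(word, "?")


def transfer(stream, glossary, log):
    """Highest stage: emits a target word as soon as its input arrives."""
    for word, _category in stream:
        target = glossary.get(word, word)
        log.append(("emit", target))
        yield target


def translate(words, lexicon, glossary):
    log = []
    stages = transfer(lookup(source(words, log), lexicon), glossary, log)
    return list(stages), log


# Assumed toy data.
LEXICON = {"der": "DET", "hund": "N", "schlaeft": "V"}
GLOSSARY = {"der": "the", "hund": "dog", "schlaeft": "sleeps"}

output, log = translate(["der", "hund", "schlaeft"], LEXICON, GLOSSARY)
print(output)
print(log)
```

The log alternates between "read" and "emit" entries: output for the first word is produced before the second word has been read, which is the "vertical" behavior described above, in contrast to a "horizontal" arrangement in which each stage would finish the whole text before the next one starts.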



I may be overly optimistic when I say that eventually the cost
of machine translation may depend on two factors:

a) the speed with which the source material to be
translated can be read into the computer, and

b) the speed with which the translation can be printed
out by computer.

Holographic memories will provide us with the technical
capabilities to construct models which are to a high degree
representations of the reality which surrounds and which
affects us. They will provide us with the means to test
our hypotheses, and, if necessary, to modify or even reject
them. It is our task to be prepared for these possibilities
by performing the necessary research, by collecting the
necessary data. This task will not be easy; it will also
be expensive, but eventually it will be rewarding, not just
as an "intellectual exercise" but as a means to understand
ourselves, to become an integrated part of a cybernetic
society.




FOOTNOTES 



1 There is no need to deal in this paper with certain
claims according to which these disciplines are
actually sciences.

2 Cf. I.M. Bocheński: Die zeitgenoessischen Denkmethoden,
Dalp-Taschenbuecher, Bd. 304; Lehnen Verlag,
Muenchen, 1959 (2).

3 "Die schematische Durchfuehrung eines vorgegebenen
allgemeinen Verfahrens bietet (nach einigen Proben)
offenbar einem Mathematiker kein besonderes Interesse.
Wir koennen also die bemerkenswerte Tatsache feststellen,
dass ein schoepferischer Mathematiker durch
die spezifisch mathematische Leistung der Entwicklung
einer allgemeinen Methode den durch diese Methode
beherrschten Bereich gewissermassen mathematisch
entwertet." Hans Hermes: Aufzaehlbarkeit, Entscheidbarkeit,
Berechenbarkeit, Springer-Verlag, Berlin, 1961.
The translation of this passage provided in the English
translation of this book somehow does not reflect the
author's statement.

4 Charles J. Fillmore: The Case for Case in: Universals
in Linguistic Theory (eds.: Emmon Bach and Robert T.
Harms), Holt, Rinehart and Winston, Inc., New York, 1968.

5 Cf. John Lyons: Introduction to Theoretical Linguistics,
Cambridge University Press, Cambridge, 1968.

6 Personal communication with Reed Bates and Emmon Bach.

7 This principle is most often used in dictionary definitions
where the meaning of the term defined is a common subset
of the meaning of the words linked by "or" in the definiens.

8 Cf. Peters, P. Stanley and Robert W. Ritchie: "A Note
on the Universal Base Hypothesis", Journal of Linguistics,
Vol. 5, 1969 and "On the Generative Power of Transformational
Grammars", to appear in Information Sciences.
It is surprising how little impact their results have
had on the linguistic community, so far. For the only
exception - to my knowledge - cf. Emmon Bach: "Syntax
since Aspects" (paper given at the Georgetown Roundtable
Conference, March 1971).








9 Cf. the publications in the series: Transformation
and Discourse Analysis Papers, University of Pennsylvania.

10 Performed in spring 1967 and described in
Lehmann-Stachowitz, 1970 and Stachowitz, 1971.

11 Clearly, commands, requests and questions might be
reformulated as statements, as for example "Someone
orders that S", "Someone requests that S", "Someone
requests a statement S(x)" such that the variable x is
replaced by a constant, where x represents the
questioned element in a sentence, as in "Where are
you going?" or by an affirmation, negation or
modification of certainty or uncertainty as in "Will he
come?" "Yes", "No", "Maybe", "Possibly", "Maybe not",
etc. We do not have such a reformulation in mind.
We argue in the next paragraph of the text that a
sentence evokes an image of something. This "something"
we want to call a state of affairs.

12 j>i stands for: The point of time represented by
j is later in time than the point of time represented
by i.

13 Lines which extend from a node represent predications
joined by logical and.

14 Clearly, this is a simplified version of the meaning
of "sell" (A,B,C,D). We ask the reader to accept our
definition.

15 Line 7, representing the property "physical object",
may be omitted from Figure 10 if we assume a meaning
rule component which contains the meaning rule "For
all x, if x is a car, then x is a physical object".

16 The value for n will have to be determined
experimentally.

17 If the equivalence relation between "sell" and "pay",
and "sell" and "pass" is not regarded as appropriate,
the sign for equivalence may be replaced by the sign
for inference.

18 Ternary relations are represented by a triangle,
binary relations by a cross-section of a lens.

19 Requirements c and d are possibly too strict to
represent actual speech production.









20 We are ignoring in this representation the various
time relations as expressed in Figure 7.

21 A leaflet handed out by one of the University of
Texas at Austin student groups in 1965 contained
as the only statement: "Students should have a
voice in decisions that effect them". We assume
that the system as well as the reader of this
footnote automatically interprets "effect" as
"affect"; the system would do this because it
becomes "aware" of the absurdity of the statement
as it stands, in contrast to the reader, who,
normally, only becomes aware of it when the
printing error is pointed out. (I owe this
example to Professor Norman Martin of the University
of Texas at Austin Philosophy Department.)

22 To be exact, the terms "referent", etc. only refer
to the objects which are "involved" in states of
affairs.

23 We are using the term "synonymous" as a substitute
for the term "equi-iconic", which to define would
be a further digression; for this term cf.
Lehmann-Stachowitz, 1971b.

24 We exclude from this judgment the works of J.A.
McConochie, Simplicity and Complexity in Scientific
Writing: A Computer Study of Engineering Textbooks,
Ed.D. dissertation, Columbia University, 1969, and
M.L. Gopnik, Linguistic Structures in Scientific
Text, Ph.D. dissertation, University of Pennsylvania,
1969; both authors have arrived at results which seem
to indicate that the language used in scientific
texts is indeed a simpler subset of the regular
language.

25 A stylistically correct translation would be "He
...

26 The actual percentage is lower since we considered
only eight verbs of 15 verbs occurring in that
passage. The text, though originally selected at
random, is, of course, too short to count as a re-

27 Wildhagen, Karl and Will Héraucourt, English-German,
German-English Dictionary. Vol. II German-English;
Brandstetter Verlag, Wiesbaden, 1953, and Heinz
Messinger: Langenscheidts Handwoerterbuch Deutsch-Englisch,
Langenscheidt KG, Berlin, 1960 (2).







28 Such an assumption would, of course, mean that there
are certain human beings which have learned and can
express certain things in their language which no
speaker of another language can learn and express.
We regard this as impossible.

29 Gruber, Jeffrey S., Studies in Lexical Relations.
Ph.D. dissertation, M.I.T., Cambridge, September,
1965.

30 For a comprehensive description, cf. S.R. Petrick,
"Syntactic Analysis Requirements of Machine Translation",
IBM T.J. Watson Research Center, Yorktown
Heights, 1971.

31 Automatic Language Processing Advisory Committee
1966. Language and Machines: Computers in Translation
and Linguistics. Publication 1416. Washington,
D.C., National Academy of Sciences, National
Research Council.

32 Petrick (op. cit.)

33 Immanuel Kants Logik, ein Handbuch zu Vorlesungen
in: Immanuel Kant - Werke in zehn Baenden (Herausgegeben
von Wilhelm Weischedel), Band 5, Wissenschaftliche
Buchgesellschaft, Darmstadt, 1968
(pocket book edition of the Kant-Studienausgabe).

34 Catford, J.C., A Linguistic Theory of Translation:
An Essay in Applied Linguistics, London, Oxford
University Press, 1965, published as volume 8 in
the series Language and Language Learning, R. Mackin
and P.D. Strevens (eds.) and Louis Hjelmslev,
Prolegomena to a Theory of Language. Baltimore, 1953.

35 This is necessary to insure the eventual well-formedness
of the standard string. If more than one string
should result, those which most closely correspond
in their accidental features to those of the input
sentence t can be selected.

36 We have taken these examples from Langenscheidt's
German-English dictionary, cf. footnote 27.






37 The comma has a stronger binding power than the

38 We use the arrow to refer to a binary relation
which is nominalized.



39 An "and expression" attaches two lines to a node 

if it is not in the domain of another "and expression"; 
one branch, if it is. 

40 The terms "a" and "the" have really several operation
statements associated with them, interpreting such
sentences as "A whale is a mammal", "The whale is a
mammal", "The United States is a country", and
"Whenever John rides a bus, he starts a fight with the
conductor".

41 The NF expressions contain the semantic features of 

the interpreted terminals of the language, which permits 
the disambiguation of the predications upon x,, , 

42 We treat proper names as predications for two reasons : 
They may refer to more than one object; certain 
semantic features, like human, male, female, are 
normally associated with proper names, even size, as 

e . g . "Haenschen" (little John). In our system, the 
"proper names" of objects are represented by a sub- 
script of x, 

43 Hockett, Charles F, , A Course in Modern Linguistics , 

The MacMillan Company, New York, 1960. 

44 Such information includes semantic markers, distin- 
guishes in the Katz-Fodor sense, area of provenience 
information, and stylistic information. 

45 The list of English verbs, taken from Hornby, A.S.,
E.V. Gatenby and H. Wakefield, The Advanced Learner's
Dictionary of Current English, Second Edition,
Oxford University Press, London, 1963, will appear
as an appendix to Lehmann-Stachowitz, 1971b; the list
of German verbs, in Lehmann-Stachowitz, 1971a. The
lists are alphabetically arranged according to the
following criteria:

a) verbs which are both transitive and intransitive,
b) verbs which are only transitive, and
c) verbs which are only intransitive.

Each list is subdivided into two parts: one with one-word
entries, the other with entries consisting of more
than one word. The list of English verbs which take
prepositional objects, sorted alphabetically according
to various criteria, has appeared as an appendix to
Lehmann-Stachowitz 1970.

46 The results will be published as derivational dictionaries
of German and English, sorted according to affixes and




47 This look-up procedure is actually more efficient than 
generating a glossary of the text and analyzing each 
word only once. 

48 I would like to thank Bary Gold for calling my 
attention to this article and for discussing some of 
the technicalities and my conclusions with me. 





APPENDIX 



Lexicographic Work at the Linguistics Research Center 

Lexicographic work at the Center is performed in five 
stages : 

a) the copying of lexical material from dictionaries,
such as Wildhagen, cf. footnote 27, and Hornby, cf.
footnote 45. Information pertaining to distinguishers
and area of provenience is copied as given in the
dictionaries;

b) the addition of syntactic and semo-syntactic features
to the obtained items according to the classification
scheme given in the following pages;

c) the establishment of equivalence relations or inference
relations between syntactic and/or semo-syntactic features
of all entries or large subsets of entries. (Features
that can be predicted from the occurrence of other features
need not occur in the dictionary; they can be introduced
by means of redundancy rules during actual analysis);

d) mechanical conversion of the established lists to the
LRS dictionary format;

e) conflation with the current LRS dictionaries, which
contain for each item a subscript pertaining to
paradigmatic information and, in the case of allomorphs, a
subscript with the information on how to generate the
lemma. German nouns contain gender information; all
adjectives contain information about their attributive
and/or predicative use.
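
The redundancy rules mentioned under stage c can be pictured with a small sketch. The rule set below (HU implies AN, AL implies AN, AN implies PO) is an assumption chosen only for illustration; the report does not list the Center's actual rules, and the function name is hypothetical.

```python
# Hypothetical redundancy rules: a coded feature implies further features,
# so the implied ones need not be stored in the dictionary entry.
REDUNDANCY_RULES = {
    "HU": {"AN"},  # assumed: human implies animate
    "AL": {"AN"},  # assumed: animal implies animate
    "AN": {"PO"},  # assumed: animate implies physical object
}


def expand_features(coded):
    """Return the coded features plus every feature the rules predict."""
    expanded = set(coded)
    pending = list(coded)
    while pending:
        feature = pending.pop()
        for implied in REDUNDANCY_RULES.get(feature, ()):
            if implied not in expanded:
                expanded.add(implied)
                pending.append(implied)
    return expanded


print(sorted(expand_features({"HU"})))  # ['AN', 'HU', 'PO']
```

Under these assumed rules a lexicographer would code only HU, and the analysis step would supply AN and PO when the entry is used.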

Stages a and b represent the descriptive phase; stage c,
the interpretative phase. Lexicographic work on German
and English adjectives, adverbs and nouns is in stage a,
work on verbs and a subset of nouns in stage b. During
stage c, we plan to introduce additional semantic features
required because of the distinguishers associated with
some lexical items. (Area of provenience information is
handled as one of the accidental features of a lexical
entry.)

The following pages are a copy of the coding instructions
for the LRC lexicographers. Note that some semo-syntactic
features occur - to facilitate encoding - as syntactic
features, cf. the subscript RL under nouns. During the
conversion to LRS format, the features will receive their
"correct" interpretation.




VERB FEATURES

TY    (VT, VR, VI, VTC, NP, NG_E*)
TS    (HU, AL, PL, IN, AB, PO, AN, BP, MS, CN, CO, NM, UN,
      QU, MA, E, P)
FS    (NP, IT, TH, MI, FT_E, GR_E, ICL, IMI_E, II_E)
DS_G  (G, D, A)
OB    (G_G, D_G, A_G, O_E, all PREP's, TH, CL, MI, FT_E, GR_E,
      ICL, IMI_E, PAPL, II_E, BC, CM, NC, NA, AC, I)
TO    (HU, AL, PL, IN, AB, PO, AN, BP, MS, CN, CO, NM, UN,
      QU, MA, E, P, R, RCC, IT)
RA    (TIM, PNC, EXT, SIM, PRI, POST, LOC, DIR, ORN, MAN,
      MOD, CAUS, MSR, DEG, FRQ, PRB)
OA    (DOR)**

Subscript Definitions:

TY = type of verb

TS = type of subject; always code one of the underlined
values for TS and TO; code values without underline
only if subject or object is restricted to that
value




FS = form of subject

DS = deep subject; mark only if English translation is
nominative, e.g. es friert mich; do not mark es
gehört mir

OB = form of object; for 2 objects with +, the order is:
O + PREP, O + CLS; PREP + PREP reverse order given
in dictionary. English: Only one object: NP of
refl. is not marked. Adjust G order to E order

TO = type of object; code TO values even for object
clauses and phrases

RA = requires adverb; e.g. put RA(DIR): He put the
book on the table, but *He put the book.

OA = optional adverb




Value Definitions:

TY  VT    = takes at least one object which is not a
            reflexive pronoun
    VR    = takes at least one object which must be a
            reflexive pronoun
    VT,VR = takes at least two objects, one which is
            reflexive and one which is not a reflexive
            pronoun
    VI    = intransitive
    VTC   = takes a cognate object only; we define cognate
            object as the true cognate and all nouns
            subsumed under that term, e.g. einen Tanz (Walzer,
            Regentanz) tanzen
    NP    = no passive
    NG    = no progressive

TS  HU, AL, etc. as defined for noun features
    E     = entia (any noun, PO or AB)
    P     = plural

FS  NP    = noun phrase; code only if another FS value is
            present
    IT    = it, es; no TS information is required
    TH    = that-clause
    MI    = marked infinitive
    FT    = for-to construction
    GR    = gerund
    ICL   = interrogative clause
    IMI   = interrogative pronoun + marked infinitive
    II    = interrogative pronoun + infinitive

DS  G     = genitive
    D     = dative
    A     = accusative

OB  O     = NP object
    TH, MI, etc. as defined above for FS
    CL    = main clause
    PAPL  = past participle
    BC    = takes be + NP or ADJ ([example illegible])
    CM    = takes optional be + NP or ADJ
    NA    = takes NP or ADJ complement without be
    NC    = takes NP complement without be (elect)
    AC    = takes adjective complement without be
    I     = infinitive

TO  HU, AL, etc. as defined for noun features
    E     = entia (any noun, PO or AB)
    P     = plural
    R     = reflexive
    RCC   = reciprocal ([example illegible])





RA, OA
    TIM   = time
    PNC   = punctual
    EXT   = extensional
    SIM   = simultaneous with point of reference
    PRI   = prior to point of reference
    POST  = later than point of reference
    LOC   = location
    DIR   = direction to
    ORN   = direction from
    MAN   = manner
    MOD   = modality
    CAUS  = causality
    MSR   = measure
    DEG   = degree
    FRQ   = frequency
    PRB   = degree of certainty
    DOR   = direction or origin, i.e. adverb of directionality



Case ambiguity in German prepositions.
Example: AN1, AN2; 1 = acc., 2 = dat.



* Subscript E: relevant for English verbs only 

Subscript G : relevant for German verbs only 



For the descriptors TS and TO, one of the underlined 
features must be coded for each verb; values without 
underline can be optionally added. 
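
As an illustration of how such a coding sheet might be checked mechanically, the sketch below tests a coded verb against the value lists of the table. The dictionary layout, the function name, and the choice to code put as TY(VT) are assumptions made for this example; only the value lists and put RA(DIR) come from the coding instructions, and only three subscripts are included to keep the sketch short.

```python
# Allowed values per subscript, copied from the verb feature table.
VERB_VALUES = {
    "TY": {"VT", "VR", "VI", "VTC", "NP", "NG"},
    "RA": {"TIM", "PNC", "EXT", "SIM", "PRI", "POST", "LOC", "DIR", "ORN",
           "MAN", "MOD", "CAUS", "MSR", "DEG", "FRQ", "PRB"},
    "OA": {"DOR"},
}


def invalid_codes(entry):
    """Return the (subscript, value) pairs that the table does not allow."""
    problems = []
    for subscript, values in entry.items():
        allowed = VERB_VALUES.get(subscript)
        for value in values:
            if allowed is None or value not in allowed:
                problems.append((subscript, value))
    return problems


# Hypothetical coding of "put": transitive, requires a directional adverb.
put = {"TY": ["VT"], "RA": ["DIR"]}
print(invalid_codes(put))              # []
print(invalid_codes({"RA": ["XYZ"]}))  # [('RA', 'XYZ')]
```

A check of this kind would catch miskeyed values before the mechanical conversion to the dictionary format; it says nothing about whether a value is linguistically right for the verb.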







NOUN FEATURES

TY  (HU, AL, PL, IN, AB, AN, PO, MA, BP, MS, CN, CO, NM,
    UN, QU)
OB  (all prepositions)
TO  (HU, AL, etc.)
TA  (ZU, CL, TH, DIR)
SX  (MA, FE)
RL  (WO; WORIN; WARUM; OB; WIE; ALS)
DF  (VT, VI, A)
FM  (A)



Subscript Definitions:

TY = type of noun
OB = object
TO = type of object
TA = takes attribute
SX = sex
RL = relative pronoun
DF = derived from
FM = form



Value Definitions:

TY  HU = human
    AL = animal
    PL = plant
    IN = inanimate
    AB = abstract
    AN = animate
    PO = physical object
    MA = machine which can perform human activities
    BP = body part
    MS = mass (homogeneous, occurs without article in sg.;
         e.g. milk, sand)
    CN = count
    CO = collective (components can be counted; can be used
         with disperse (group, herd, government))
    NM = proper name
    UN = unit (ADV/QU [...]; e.g. Meter, Jahr)
    QU = quantity ([...] (of) NP; e.g. group, glass, half,
         dozens)

In this set, one of the underlined values must be coded
for each noun; values without underline are optionally
added as appropriate.









TA  ZU  = zu-infinitive
    CL  = main clause
    TH  = that-clause
    DIR = direction (e.g. Flucht nach Italien, zu den
          Indern)

SX  MA  = male
    FE  = female

DF  VT  = transitive verb
    VI  = intransitive verb
    A   = adjective

FM  A   = adjective (e.g. "Abtruennige(r)" is coded as
          a noun: ABTRUENNIG TY(HU) FM(A))



Compounds : 

BAUM + WOLL + FABRIKANT 
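
The coded form given for Abtruennige(r) in the noun value definitions, a lemma followed by subscripts with parenthesized values, lends itself to mechanical reading. The parser below is a minimal sketch of that idea; the function name and the returned structure are assumptions, and the actual LRS conversion program is not described in this report.

```python
import re


def parse_coded_entry(line):
    """Split 'LEMMA SUB(V1, V2) ...' into (lemma, {subscript: [values]})."""
    lemma, _, rest = line.strip().partition(" ")
    features = {}
    # A subscript is a run of capitals, optionally followed by blanks,
    # then a parenthesized, comma-separated value list.
    for subscript, values in re.findall(r"([A-Z]+)\s*\(([^)]*)\)", rest):
        features[subscript] = [v.strip() for v in values.split(",") if v.strip()]
    return lemma, features


print(parse_coded_entry("ABTRUENNIG TY(HU) FM (A)"))
# ('ABTRUENNIG', {'TY': ['HU'], 'FM': ['A']})
```

The pattern tolerates a blank between subscript and parenthesis, since the coding sheets are not consistent on that point.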












ADJECTIVE FEATURES

MD  (HU, AL, PL, IN, AB, PO, AN, E, TH, PLU)
FM  (PRPL, PAPL)
TY  (MSR, TM)
RA  (TIM, PNC, DUR, PLC, LOC, DIR, ORN, MAN)
OB  (G, D, A, PREP's)
TO  (HU, AL, PL, IN, AB, PO, AN, E)



Subscript Definitions:

MD = the adjective modifies nouns of the specified type
FM = the adjective has the form of a participle
TY = type of adjective
RA = the adjective requires an adverb (e.g. wohnhaft)
OB = object
TO = type of object



Value Definitions:

MD  HU, AL, etc. as defined for nouns
    TH   = that-clause
    PLU  = plural noun or collective or mass noun

FM  PRPL = present participle
    PAPL = past participle

TY  MSR  = measurable (wide, old; e.g. five years old,
           five men strong)
    TM   = may undergo tough movement (hard, easy)

RA  TIM, PNC, etc. as defined for adverbs.

OB  G    = genitive
    D    = dative
    A    = accusative








TENTATIVE ADVERB FEATURES

TY  (TIM, PNC, EXT, SIM, PRI, POST, LOC, DIR, ORN,
    MAN, MOD, CAUS, MSR, DEG, FRQ, PRB)
MD  (A, AV, V, N, S)



Subscript Definitions:

TY = type of adverb
MD = modifies



Value Definitions:

TY  TIM  = time
    PNC  = punctual
    EXT  = extensional
    SIM  = simultaneous with point of reference
    PRI  = prior to point of reference
    POST = later than point of reference
    LOC  = location
    DIR  = direction to
    ORN  = direction from
    MAN  = manner
    MOD  = modality
    CAUS = causality
    MSR  = measure
    DEG  = degree
    FRQ  = frequency
    PRB  = degree of certainty

In this set, one of the underlined values must be
coded for each adverb; values without underline
are optionally added.

MD  A    = adjective
    AV   = adverb
    V    = verb
    N    = noun
    S    = sentence








Bibliography

Carnap, Rudolf. 1928. Der logische Aufbau der Welt.
Weltkreis-Verlag, Berlin.

Carnap, Rudolf. 1934. Logische Syntax der Sprache. Julius
Springer Verlag, Wien.

Chomsky, Noam. 1965. Aspects of the Theory of Syntax. M.I.T.
Press, Cambridge, Massachusetts.

Frege, Gottlob. 1879. Begriffsschrift, in: From Frege to
Gödel (ed. Jean van Heijenoort), Harvard University
Press, Cambridge, Massachusetts, 1967.

Harris, Zellig S. 1957. "Co-occurrence and Transformation
in Linguistic Structure", Language, Vol. 33, No. 3.
Also in: The Structure of Language (eds. J.A. Fodor
and J.J. Katz), Prentice-Hall, Inc., Englewood Cliffs,
New Jersey, 1964.

Harris, Zellig S. 1962. String Analysis of Sentence Structure.
Mouton & Co., The Hague.



Harris, Zellig S. 1965. "Transformational Theory", Language,
Vol. 41, No. 3.

Harris, Zellig S. 1968. Mathematical Structures of Language.
John Wiley & Sons, New York.

Kamlah, Wilhelm and Paul Lorenzen. 1967. Logische Propädeutik
oder Vorschule des vernünftigen Redens. Bibliographisches
Institut, Mannheim.

Kaplan, Abraham. 1964. The Conduct of Inquiry: Methodology
for Behavioral Science. Chandler Publishing Company,
San Francisco.

Katz, Jerrold J. and Jerry A. Fodor. 1963. "The Structure
of a Semantic Theory". Language, Vol. 39, No. 2.
Also in: The Structure of Language. Readings in the
Philosophy of Language (eds. J.A. Fodor and J.J. Katz),
Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1964.








Katz, Jerrold J. 1966. The Philosophy of Language. Harper
& Row, New York.

Lehmann, Winfred P. and Rolf Stachowitz. 1970. Research
in German-English Machine Translation on Syntactic
Level, Vol. II. Linguistics Research Center, The
University of Texas at Austin, Austin, Texas.

Lehmann, Winfred P. and Rolf Stachowitz. 1971a. Development
of German-English Machine Translation System.
Linguistics Research Center, The University of Texas
at Austin, Austin, Texas.

Lehmann, Winfred P. and Rolf Stachowitz. 1971b. Normalization
of Natural Language for Information Retrieval.
Linguistics Research Center, The University of Texas
at Austin, Austin, Texas.

McCawley, James D. 1968a. "The Role of Semantics in a
Grammar", in: Universals in Linguistic Theory
(eds. Emmon Bach and Robert T. Harms), Holt, Rinehart
and Winston, New York.

McCawley, James D. 1968b. "Concerning the Base Component
of a Transformational Grammar". Foundations of Language,
Vol. 3, No. 3.

Morris, Charles. 1946. Signs, Language and Behavior. New
York, Prentice-Hall, Inc.

Petrick, Stanley Roy. 1965. A Recognition Procedure for
Transformational Grammars. Ph.D. dissertation, M.I.T.

Reichenbach, Hans. 1947. Elements of Symbolic Logic.
Collier-Macmillan Ltd., London. (First Free Press
Paperback edition 1966).



Stachowitz, Rolf. 1970a. "The Construction of a Computerized
Dictionary". Paper presented at the Modern Language
Association Lexicographical Conference, Columbus, Ohio.

Stachowitz, Rolf. 1970b. "A Model for the Recognition and
Production of Synonymous Expressions with Different
Deep Structures". Paper presented at the Conference

Tesnière, Lucien. 1966. Éléments de Syntaxe Structurale.
Librairie C. Klincksieck, Paris. (deuxième édition
revue et corrigée).

Weinreich, Uriel. 1966. "Explorations in Semantic Theory".
Current Trends in Linguistics 3. Theoretical Foundations
(ed. Thomas A. Sebeok), Mouton & Co., The Hague.










The Current Status of Computer Hardware and Software as it Affects
the Development of High Quality Machine Translation



by 



D. Walker

The Mitre Corporation 






The developments in computer hardware and software over the
past ten years have gone a substantial way toward satisfying the needs
specified in the early '60's as prerequisites for effective machine
translation programs. In particular, the storage capacities and processing
speeds of current computers far exceed some of the stipulated
requirements established during that period. Increases in sophistication of
programming systems have paralleled hardware developments, as evidenced
in operating systems like OS for the IBM 360 and 370 series and Tenex
for the PDP-10, to name only two. Compiler technology also has advanced
markedly during the period, particularly as elaborations of the
syntax-directed techniques introduced about ten years ago. Programming languages
as well have increased in breadth, flexibility, and power, so that, although
assembly language coding certainly still would reduce run-time, it no
longer is a cost-effective alternative. As a result it seems reasonable
to say that hardware and software considerations no longer constitute
major obstacles to machine translation, at least according to strategies
that are currently being pursued.




[...] 1960's. While hardware and software may not be obstacles, it is not
clear that they have been used to full advantage. However, looking at
the other side of the issue, it also is not clear that new approaches,
particularly those motivated by the recent concerns with semantic
processing, might not result in specifications for machine architecture or
programming that cannot be met by existing equipment and procedures.

Whatever importance is assigned to these observations, it is
clear in any case that the problems of mechanical translation at this
stage are primarily of two kinds, linguistic and algorithmic. That is,
the responsibility for establishing hardware and software requirements
depends on the design specifications for a mechanical translation system.
And these specifications entail a knowledge of the grammars of the languages
involved, a strategy for analyzing them, and a procedure for relating the
analyses. Until we can resolve these matters satisfactorily, any
prescriptions for hardware and software are purely speculative.

In spite of these uncertainties, one class of computer capabilities
should be stressed in this context both because of its potential use in the
process of mechanical translation and because of the role it may play in
grammar development and in the formulation of algorithms for linguistic
analysis. I am referring to interactive capabilities that allow for on-line
access to the computer. Although it is only recently that such man-machine







would fully automatic approaches. Again, I suspect that the problem
here as before is the lack of understanding about grammar, linguistic
analysis, and translation algorithms. However, there has been a
substantial amount of work now with grammar testers and with systems
for handling personal files, work that should be extended into the
mechanical translation arena.










Equivalents and Explanations in Bilingual Dictionaries 





EQUIVALENTS AND EXPLANATIONS IN BILINGUAL DICTIONARIES 1

L. Zgusta

The task of the bilingual lexicographer is to find such
lexical units in the target language as are equivalent to
the lexical units of the source language, and to coordinate
them. We call "lexical equivalent" a lexical unit of the
target language which has the same lexical meaning as the
respective lexical unit of the source language. The
definitional requirement is that the identity should be
absolute: the equivalent should have the same polysemy,
the same stylistic value, etc. But such absolute
equivalents are rather rare; in the majority of instances,
the lexical meaning of the respective lexical unit of the
target language corresponds only partly to that of its
counterpart in the source language. If we wish to be very
precise, we therefore speak about partial equivalents; but
normally we use the term "equivalent", knowing that the
majority are partial.




Before starting the search for equivalents, we must
compare the structures of the two languages in order to




this principle too strictly. For instance, German
Handarbeit (subst.) has a good equivalent in English
hand-work (subst.), but if it is used as a label on
wares, the English equivalent is hand-made, because
the English substantive denotes only the process, not
its results. Usually, there are not only such isolated
points of trouble, but also discrepancies rooted in the
system. It is easy to decide that English substantives
and adjectives will be considered equivalent to Czech
substantives and adjectives, and to indicate pairs like
Czech nebe : English heaven, Czech nebeský (adj.) :
English heavenly, in a Czech-English dictionary. But
there will also be pairs like Cz. cihla : Eng. brick,
Cz. cihlový (adj.) : Eng. brick (as in a brick wall).
The second pair of equivalents can be left
without comment if the Czech user of the dictionary is
supposed to have a fair knowledge of English. If this is
not true, the entry of the second pair should contain an
indication of how the equivalent is construed, e.g. by giving
an example (brick wall). The example used here is easy to
handle, but the real life of the lexicographer poses more
difficult problems of this type. The main thing seems to
be to see these discrepancies before one begins the
concrete work and to decide on their solution in general, so
that the individual instances are treated in a unified way
in the whole dictionary.






The equivalent should be a real lexical unit of the 




saying that this can be shortened by using native
speakers as informants, or by using one's own competence;
but at least some collections of contexts - not necessarily
long ones - usually are essential.) The lexicographer
then tries to translate all these typical contexts into the
target language, using in each instance the prospective
equivalent of the target language. If the prospective
equivalent fits into all these contexts, it is an absolute
one; if not, it is partial and the entry will
have to indicate some other (partial) equivalent(s) to
cover the whole range of the lexical meaning of the entry
word. The way the lexicographer presents the data in the
dictionary is largely governed by the purpose of the
prepared dictionary. Let us discuss some examples.
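
The context test just described amounts to a simple decision procedure, which may be sketched as follows. The function name, the dictionary layout, and the two context labels are assumptions for illustration; the judgements are modelled on the Handarbeit case discussed earlier, where the English substantive fits the process but not the label on wares.

```python
def classify_equivalent(fits_by_context):
    """Classify a prospective equivalent from its fit in typical contexts.

    fits_by_context maps a context label to True when the prospective
    equivalent can translate that context, and to False otherwise.
    Returns the classification and the contexts still left uncovered.
    """
    uncovered = [ctx for ctx, fits in fits_by_context.items() if not fits]
    kind = "absolute" if not uncovered else "partial"
    return kind, uncovered


# Hypothetical judgements for German Handarbeit -> English hand-work.
handarbeit = {
    "the process of working by hand": True,
    "label on wares": False,
}
print(classify_equivalent(handarbeit))
# ('partial', ['label on wares'])
```

The uncovered contexts are exactly those for which the entry has to supply a further partial equivalent, here hand-made.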



German heiraten, sich verheiraten "to marry" are usually
considered equivalents of Chinese xu jia. One of the
differences between them is that the Chinese lexical unit is
used in reference to women only. In a dictionary whose
only purpose is to help native speakers of German to
understand Chinese texts the entry could have the basic form

xu jia : "heiraten, sich verheiraten"






[...] two equivalents are applicable in all contexts,
so that it is not necessary to state the restriction of
the Chinese lexical unit; and the German user needs no
information about the German equivalents. But if the [...]

xu jia (von Frauen) : "heiraten, sich verheiraten"



If, on the other hand, the dictionary is intended 
to help the Chinese user produce German texts, it is 
necessary to indicate the difference between the two 
German partial equivalents , so that the user can make 
the right choice: 



xu jia : "heiraten" ("to take in marriage"),
"sich verheiraten" ("to get married")



(The English words in quotation marks symbolize an
indication which would have the form of a gloss or of an
explanation, either in German or in Chinese, in a real
dictionary.)

A combination of the intentions mentioned requires,
then, an entry of a form like

xu jia (von Frauen) : "heiraten" ("to take in marriage"),
"sich verheiraten" ("to get married")


Another type of entry can be discussed with the help of [...]
be considered the German equivalent of Chinese xianxie.
The Chinese contexts are roughly of the type: He nearly
stumbled, fell, starved, died, knocked someone down, [...]




In a Chinese-English dictionary, the entry could have the 
form 



xianxie (referring to negative events) "almost,
nearly"

The applicational restriction could be stated in the form
of an example or of some examples; the advantage of this
method of presentation is that the information is more
immediate, and, additionally, that it is less explicit
than the gloss.



Let us now consider the English equivalents. They both
have multiple meaning. If we accept Hornby's description
of their meaning, we see that almost has two senses, viz.:
(1) as in He almost fell (almost is replaceable by nearly),
(2) as in Almost no one believed her (almost is not
replaceable by nearly). The other equivalent, nearly, has
(according to Hornby again) three senses, viz.: (1) as in
It's nearly 1 o'clock (replaceable by almost), (2) as in
I have $20, but that will not be nearly enough for my
journey (not replaceable), (3) as in nearly related (not
replaceable).






If we quote almost, nearly together as equivalents of
the Chinese lexical unit, they disambiguate each other,
because every user will assume that only that sense applies
which is common to both of them.
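
The mutual disambiguation of two polysemous equivalents can be modelled as a set intersection over their senses. In the sketch below the sense labels are informal paraphrases of Hornby's senses as summarized above, and the names SENSES and shared_senses are hypothetical; this is an illustration of the reasoning, not a proposal for a dictionary format.

```python
# Informal sense inventories (assumed paraphrases, not Hornby's wording).
SENSES = {
    "almost": {"not quite", "with negatives (almost no one)"},
    "nearly": {"not quite", "closely (nearly related)",
               "not nearly = far from"},
}


def shared_senses(*words):
    """Return the senses common to all the equivalents quoted together."""
    return set.intersection(*(SENSES[word] for word in words))


print(shared_senses("almost", "nearly"))  # {'not quite'}
```

The single surviving sense is the one the user is expected to assume when both words are quoted side by side.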


On the other hand, if we consider the German equivalents 




target language helps the user to find various expressions
he can use, if only for stylistic variation. And second,
imperceptible as the difference is, there usually is some
slight difference between the meaning of even such close
synonyms, so that if both are indicated, the information
is richer and the user is inspired to imagine yet other
possible translations and synonyms. But in any event,
even a large dictionary should not indicate too many
synonyms of this type, and a small one can omit them.



In sum, we have discussed three types of indication
of partial equivalents and synonyms:

(1) heiraten; sich verheiraten : a rule (semantic or
grammatical) of the target language makes it predictable
which of the two will be used;

(2) almost, nearly : both can be used, but only in
those senses of their multiple meaning which overlap;

(3) beinahe, fast : either can be used, and the two
taken together make the information somehow richer.



Although there are many borderline cases between these
types, it is useful to know them; but it is above all types
(2) and (3) which are difficult to distinguish. In type (1),
it is preferable to put a semicolon between the two partial
equivalents; in types (2) and (3), a comma is generally used.
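
The punctuation convention for the three types is simple enough to state as a lookup. The following sketch assumes that an entry's equivalents are held as a list and that the type has already been decided by the lexicographer; SEPARATOR and format_equivalents are hypothetical names.

```python
# Separator per type of indication: semicolon for type (1), where a rule
# of the target language decides the choice; comma for types (2) and (3).
SEPARATOR = {1: "; ", 2: ", ", 3: ", "}


def format_equivalents(equivalents, kind):
    """Join the equivalents of an entry with the separator of its type."""
    return SEPARATOR[kind].join(equivalents)


print(format_equivalents(["heiraten", "sich verheiraten"], 1))
# heiraten; sich verheiraten
print(format_equivalents(["almost", "nearly"], 2))
# almost, nearly
```

Since types (2) and (3) share the comma, the printed entry alone does not tell the user which of the two is meant.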







Another type of problem can be illustrated by the following [...]







an old malady recurs, old society, old ideology, old
dwelling, old job; (2) old method, old custom, old
dream, old archive; (3) old equipment in industry, old
material, old clothes." Unless the dictionary belongs
to the smallest type, without any generative power in
the target language, it will not be sufficient to state
simply jin : alt, but it will be felt necessary to give
richer indications. It will also be essential to indicate
that the German equivalent must not be taken in one of its
senses as in Er ist 10 Jahre alt "He is ten years old".

If the dictionary proceeds, as usual, by the indication
of synonyms, one can suppose an entry of approximately
the following type:



jin : (1) "alt, früher, ehemalig" (that is, say,
          "old, former, previous");

      (2) "alt, schon lange bestehend" ("old,
          existing for a long time");

      (3) "alt, gebraucht, durch langen Gebrauch
          abgenutzt" ("old, worn out by long use").



When we consider these indications, we see that an
equivalent like alt "old" is a lexical unit which can
be immediately inserted into a German sentence, whereas
schon lange bestehend or durch langen Gebrauch abgenutzt
are somehow felt as non-minimal, as expansions of what the
simple alt can convey. But these non-minimal expansions
have the advantage that they, when we see them in
isolation, give more information about the lexical meaning
of the source language.

Equivalents of the first type are usually called translational











great descriptive power. 



Very frequently, it is necessary to give a translational
equivalent and an explanatory one, or only an explanatory
one. For instance, an English-French dictionary can hardly
proceed by giving a simple equivalent of English boyhood,
because there is no really good one. The explanatory
equivalent would probably be something like état de garçon.
But this cannot be inserted into sentences (or translation
of sentences) like In his boyhood, he .... A more
translational equivalent like adolescence or jeunesse is
indicated. But these words are not restricted to male children
in French, as the English word is. And so the entry would
probably have to make a compromise and indicate, say,



boyhood : "[...] période de jeunesse [...] un [...]"



The explanatory equivalent has the advantage of being
very general, because it is situated rather on the notional
than on the purely linguistic level. If the user grasps
what is indicated, and if he knows [...], he will be
able to understand many different English sentences, and he
is free to adapt his French translations as need be.



used. But apart from the fact that it frequently conveys
less information, the translational equivalent can cause
a good deal of trouble to the lexicographer. Let us
discuss an example. We said that Chinese jin has a good
equivalent in German alt "old". The subsequent discussion
has shown that the lexicographer will probably feel it
necessary to add some further equivalents. This can be
pushed too far. For instance, the lexicographer may find
Chinese contexts in which the best translation would be
"preceding, foregone, past, obsolete"; there will be
contexts in which "ancient, antique, archaic" seem to fit well,
etc. But to indicate all this would mean that the bilingual
dictionary would grow into a synonymic dictionary of the
target language. The lexicographer's task is to indicate the
most general translational equivalents which have a broad
range of application. And so the explanatory equivalent and
the translational one are not so much opposed as one would
think: they both act as representatives of groups of synonyms
and near-synonyms, out of which the user may choose the most
suitable one (if he knows them, or if he is able to use a
monoglot or a synonymic dictionary of the target language).
The difference between the two types is that the translational
equivalent is always a possible choice for application in a
sentence and sometimes the best one.



The so-called culture-bound words pose another problem,
because they frequently have no lexical equivalent in the
target language. There are basically three types of
solution: (a) The lexicographer may try to create a
translational equivalent by borrowing the respective word into
the target language, frequently in a phonemically adapted
form. (b) He may try to create a translational equivalent
by coining a loan-translation, or by coining a new
expression in the target language. (c) He may try to find
an explanatory equivalent in the target language (with the
eventual hope that it may become a translational one, if
used frequently in future). If we take examples from a
less known language, the three types are:

(a) Ossetic alam : Eng. "alam" (borrowing)

(b) Oss. iron fændag : "Ossetic way" (new coinage by
loan-translation)

(c) Oss. ziu : "collective help" (explanatory equivalent)



It is clear that the explanatory equivalent (c) gives the
richest information; types (a) and (b) can be chosen only if
it is expected that the respective words will have a high
frequency in translated texts (where there will be
explanatory notes, etc.). But for a real understanding, we need an
explanation in all three types, for instance:

(a) alam : "alam" (fruit and candy bound on a twig [...]




It depends on the lexicographer's decision (and this, in
its turn, on the type and purpose of the dictionary), whether
his explanations will be minimal (as here, type b), or
whether they will verge on the encyclopedic (types a, c);
but they should have a uniform style through the whole
dictionary.



The difference between what we call an explanatory (or
descriptive) equivalent and an explanation is that the
explanatory equivalent tends to be similar to a translational
equivalent. If stabilized and accepted into the language,
it can become a lexical unit of the target language. But
an explanation tends to be very similar to a lexicographic
definition (or is even identical with it) as used in monoglott
dictionaries, and usually cannot aspire to becoming a
lexical unit. But there is no need, I think, to stress that
there are a great number of borderline cases.









And so we see that the bilingual lexicographer works
basically with translational equivalents, synonyms, mutually
disambiguating synonyms, mutually complementing synonyms,
explanatory equivalents, and explanations. All of them have
the purpose of informing the user about the meaning of the
lexical unit of the source language, of supplying him with
lexical units of the target language which can be used in
source-language sentences, and of inducing in him a recollection
of other suitable, near-synonymic lexical units of
the source language even if they are not directly indicated.


A crnnri pnf rv n-f ^ h *1 1 1 Ti mifl 1 A -5 r + l r»n 'srwr .... o 1 c r* -n ntzAc? -1 n _ 







FOOTNOTE 
1 



This article is based on a section of my Manual of
Lexicography (forthcoming). I wrote that book in cooperation
with several colleagues who supplied material
and examples from various languages. Full acknowledgement
of those examples will be found in the book itself.














The Shape of the Dictionary 
For Mechanical Translation Purposes 

by 

L. Zgusta



THE SHAPE OF THE DICTIONARY
FOR MECHANICAL TRANSLATION PURPOSES. 1

L. Zgusta






A dictionary of the type we have in mind here should
contain the lexical units of the source language, selected
according to the needs of the type of texts to be translated.
Lexical equivalents of the target language should
be coordinated with these lexical units in such a way that
the choice is as precise and as automatic as possible.
Great difficulties are caused in this task not only by the
polysemy and homonymy of the lexical units of the source
language, but also by the fact that the equivalents usually
cannot be coordinated in a one-to-one way. We call "lexical
equivalent" a lexical unit of the target language which has
the same lexical meaning as the respective lexical unit of
the source language; that means the equivalent should have
the same polysemy, the same stylistic value, etc., as the
lexical unit of the source language. However, this is seldom
the case, and, consequently, more than one equivalent
is often needed to cover the lexical meaning of the source
word. We should, then, make the distinction between absolute
equivalents, which comply with the definitional
requirements [...]



Work on this article was performed at the Linguistics [...]



speak about equivalents when it is usually the partial 



ones we have in mind. 



The present article is not primarily concerned with
the problem of (partial) equivalents, their choice, their
mutual disambiguation and the delimitation of their
applicability in an entry. This article concentrates on problems
of choice from among more than one (partial) equivalent
within the entry of a lexical unit of the source
language. The point of view taken here is that, on the
one hand, the more we can rely on simple formal indications
of the source language the better; but that, on the
other hand, such simple formal indications do not always
exist; and that one of the cardinal difficulties with which
we have to cope is that the selection of a suitable (partial)
equivalent is to be made by an agent which is by far
less imaginative than the human mind.

Semantic difference in the s[...]
[...]fore, the necessity of a certain [...]
equivalents) is frequently in[...]
form. The situation is rather s[...]















Since we envisage, for the moment, only basic translational
needs, this form of the entry should suffice to
guarantee a good selection of the equivalent in sentences
like German in dem Wald gehen - English to walk in the
wood, and German in den Wald gehen - English to walk into
the wood, that is, given the ability to recognize which
German substantive is governed by in and whether it is
dative or accusative.
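
The selection just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table IN_ENTRY and the function translate_in are hypothetical names, and the case label is assumed to be supplied by a prior syntactic analysis which is not shown here.

```python
# Hypothetical entry for German "in": the case of the governed
# substantive selects the English equivalent.
IN_ENTRY = {
    "dative": "in",        # in dem Wald gehen -> to walk in the wood
    "accusative": "into",  # in den Wald gehen -> to walk into the wood
}


def translate_in(case):
    """Return the equivalent of German 'in' for the recognized case.

    Returns None when the case could not be recognized, so that the
    caller can see that no choice was made.
    """
    return IN_ENTRY.get(case)


print(translate_in("dative"))
print(translate_in("accusative"))
```

The whole burden lies on the recognition of the case; once that is available, the choice itself is a simple lookup.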



The example just discussed is one of the simplest ones. 
It can be said that the recognition of semantic difference 
and the choice of the equivalent entailed by it are not 
difficult if the semantic difference is indicated by a 
clear morphological difference. 



The formal distinction, however, does not necessarily
have to be a morphological one; the main thing is that the
distinction should be clear in itself and non-ambiguous.
For instance, it should be easy to discern the polysemy of
German handgreiflich, because in one of its senses, it is
used exclusively with the forms of werden: handgreiflich
werden "to use physical force". In its other sense, it is
used with machen, sein, and a few other verbs: handgreiflich
machen "to make available", handgreiflich sein "to be
available".

Perhaps more complicated is the following type of case.
If we simplify the [...]











(As in German den Strom ableiten - English to lead the
current away; German die Adjektive aus den Substantiven
ableiten - English to derive adjectives from substantives.)
It would seem that it should not be too difficult to
distinguish the two types of rections quite automatically,
and make the choice accordingly. The next example will,
however, be more complicated.



The simplest way to construct the entry of German
beraten seems to be

German beraten jemanden      -> Eng. to advise
       beraten etwas         ->      to deliberate (upon)
       beraten ueber etwas   ->      to deliberate (upon)



The last two German rections are different in their grammatical
form, but there is no semantic difference. On the other hand,
there is no grammatical difference between the two first
rections, but there is a semantic difference which entails a
different choice of English equivalent. The abstract expression
of the two rections in the lexicographic entry
(jemanden : etwas) is rather simple, and no human user of
a dictionary could have difficulty with it. Still, for the
purpose of automatic recognition and choice the presence of
this entry in the dictionary entails the necessity of indicating [...]



m 




S Itua Lion i -h v.-^rr-.v r; ^ h Vh .. frT ”■ ' 







German abhalten (1) jemanden von sich  -> to hold off
                (2) jemanden von etwas -> to hinder, [...]
                (3) etwas              -> (a) to keep out
                                          (b) to hold

We see that within one rection, (3), there are two choices
(a), (b) which are semantically governed: (a) is chosen if
the object (represented by etwas) is, e.g., Wasser, Naesse,
Zugluft, Regen; (b) is chosen if the object is, e.g.,
Sitzung, Wahlen, Gericht, etc.
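
A minimal sketch of such a semantically governed choice follows. It assumes that the object of the verb has already been identified, and it models each choice simply as a set of example objects; the names ABHALTEN_3 and choose_by_examples are hypothetical.

```python
# Hypothetical model of rection (3) of abhalten: each choice is a set
# of example objects paired with an English equivalent.
ABHALTEN_3 = [
    ({"Wasser", "Naesse", "Zugluft", "Regen"}, "to keep out"),
    ({"Sitzung", "Wahlen", "Gericht"}, "to hold"),
]


def choose_by_examples(obj, choices):
    """Return the equivalent whose example set contains the object.

    Returns None when the object is not listed in any example set.
    """
    for examples, equivalent in choices:
        if obj in examples:
            return equivalent
    return None


print(choose_by_examples("Regen", ABHALTEN_3))
print(choose_by_examples("Sitzung", ABHALTEN_3))
print(choose_by_examples("Hagel", ABHALTEN_3))  # not listed: no choice is made
```

The last call shows the weak point of a purely enumerative indication: an object which is not among the examples leaves the device without a choice.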



Another example of this type is German auslegen. One of
its rections (the most frequent one) is auslegen etwas. The
respective part of the entry would have to have a form
similar to the following one:

auslegen (1) etwas (a) [im Ladenfenster] -> to display
                   (b) [Geld]            -> to pay (provisionally)
                   (c) [Texte]           -> to interpret

In a case like this, the really important indication is the one
contained in brackets. And as every lexicographer knows, to
construct those restrictive (or semantic) glosses [...]









situation envisaged in this article we try to count with
an automatic choice from among the equivalents, and this
causes much trouble. The reason is that every human user
of a dictionary will immediately understand that an indication
like [im Ladenfenster] is simply an example since
goods can be displayed also on stands within a shop or in
the market, and so on. Not only that; the human user will
also understand that the restrictive gloss [im Ladenfenster]
is, at the same time, the representative of a certain type
of situation, since one can speak about somebody displaying
his goods without mentioning where and how, and choice (a)
is then entailed. Therefore, this part of the entry could
also have the following form:

auslegen (1) etwas (a) [Waren] -> to display



This restrictive gloss would have other difficulties of its 
own. We mention it to show that restrictive glosses have to 
be chosen from among various possibilities inherent in the 
facts of language . 



In the same way, [Geld] is both an example and the
representative of a class of synonyms, near-synonyms, and
semantically related words (Summe, [...]). In (c), [...]








The difficulty of this problem is obvious. One of the 
easiest answers would be that we should increase the num- 
ber of concrete examples quoted in the restrictive glosses. 
For instance, one could imagine the following form for the 
entry quoted above: 



abhalten (3) etwas (a) [Wasser, Naesse, Fluessigkeiten,
                       Regen, Hagel, Wind, Zugluft] -> to keep out



The increase in the number of concrete examples in the
restrictive glosses would be an enormous gain; but we should
count with dozens and perhaps hundreds of them in one gloss.
It does not seem to me, however, that the more or less
exhaustive enumeration of examples could be a real solution.
Let us discuss the following example. That part of the entry
of German verjuengen which is concerned with technical
terminology could have the following form:

verjuengen etwas (a) [Maßstab]    -> to reduce
                 (b) [in biology] -> to rejuvenate



The restrictive gloss pertinent to (a) could be expanded by
an enumeration of examples. I cannot, however, see that
choice (b) could be governed by the indication of concrete
examples. First, because the area of objects of rejuvenation,
attempted or real, is rather vast; still, one can imagine a
restrictive gloss with perhaps hundreds of examples, e.g.
[..., Regenerationsfaehigkeit, ... etc.]. But the second difficulty
seems to be more grave. The area of objects of rejuvenation is
[...] purpose of science is to render it more vast. Consequently,
one must take into consideration that after we have established
our set of examples in the restrictive gloss, there
will be biological texts reporting new investigations,
discoveries, etc., concerned with new objects not stated among
our examples; which would make a correct choice of the
equivalent impossible. And since the main purpose of machine
translation is to translate recent reports on new discoveries,
etc. quickly, we can conclude that the choice of the equivalent
cannot be based on an exhaustive enumeration of contextual
examples (understood as key words), lest we block our way
to the very goal we are trying to reach.



It seems that what is needed is a classification of all
entry-words selected for the future dictionary into classes
constituted by the restrictive glosses and the semantic
criteria contained in them. For instance, since the correct
choice of an equivalent in some entries depends on
whether the object is a person or not, this category should
be indicated in the lemma of each substantival entry-word;
since a correct choice in another entry depends on whether
the context is a biological one or not, the pertinent
indication should be a part of the lemma of the respective
entry-words. This should be done with all the restrictive glosses
involved in the corpus of entries. It would require further
researches, but [...]
possible, the form of hierarchically higher notions
or if they indicated terminological areas (such
as "biology", [...]). In this way, though the [...]
applicable not to a broader semantic range of texts but at
least to a much larger corpus of them than that on which
the original investigations were based.
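
The proposal can be illustrated by a small sketch in which the restrictive glosses no longer enumerate words but name classes, and every substantival lemma carries its class marks. The class labels ("liquid_or_weather", "event"), the Lemma record and the function choose_by_class are assumptions made for the illustration only; the article does not fix any inventory of classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lemma:
    """A substantival entry-word with hypothetical class marks."""
    word: str
    is_person: bool = False
    classes: frozenset = frozenset()


# Hypothetical lexicon: each lemma is classified once, independently
# of the entries in whose glosses it may later be needed.
LEXICON = {
    "Wasser": Lemma("Wasser", classes=frozenset({"liquid_or_weather"})),
    "Hagel": Lemma("Hagel", classes=frozenset({"liquid_or_weather"})),
    "Sitzung": Lemma("Sitzung", classes=frozenset({"event"})),
}

# Rection (3) of abhalten, with class names instead of word lists.
ABHALTEN_3 = [
    ("liquid_or_weather", "to keep out"),
    ("event", "to hold"),
]


def choose_by_class(obj, choices, lexicon):
    """Return the equivalent whose class is marked in the object's lemma."""
    lemma = lexicon.get(obj)
    if lemma is None:
        return None
    for class_name, equivalent in choices:
        if class_name in lemma.classes:
            return equivalent
    return None


print(choose_by_class("Hagel", ABHALTEN_3, LEXICON))
print(choose_by_class("Sitzung", ABHALTEN_3, LEXICON))
```

A word added to the lexicon later needs only its own class marks; the entry of abhalten itself does not have to be touched.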

1'. i ■ - • • • • - ■ '• ■ 

What has been discussed up to now is certainly no panacea.
There will be cases which will resist a generalization. For
instance, another rection of German auslegen (not mentioned
above) is auslegen (2) etwas mit etwas. In German, contexts
characterized by this formal feature are not only clearly
differentiated from the contexts of the type auslegen (1)
etwas, but they also form a unified group, with a unified
if general meaning. But there is no general equivalent in
English, and the choice of the partial ones is governed by
the object of the action. Consequently, we have to imagine
that this part of the entry could have a form similar to the
following one:

auslegen (2) etwas mit etwas (a) [Teppiche]  -> to cover with (carpets),
                                                to carpet
                             (b) [Zement]    -> to line with (cement)
                             (c) [Elfenbein] -> to inlay, encrust with (ivory)





We have stated above that it is relatively easy to find a solution
for those cases in which a difference in meaning is indicated
by a difference in form, preferably in morphology. The
dictionary can make use of such differences. Sometimes, the
morphological distinction alone is sufficient to indicate the
difference in meaning. For instance, the series of German
forms die, der, der, die Diaet; die, der, den, die Diaeten
can be seen as a normal paradigm of a feminine substantive.
There is, however, the semantic difference that the forms
of the singular require the English equivalent "[...]men",
whereas those of the plural require the equivalent
"daily allowances". Such a situation is easy to solve.
Probably every lexicographer will take Diaet "diet" as a
singulare tantum, and Diaeten "daily allowances" as another
word, a plurale tantum; and such a solution is undoubtedly
even more practical for an automatic [...]



But not all cases are as beautifully clear-cut as this.
A morphological difference is sometimes of only partial
value. For instance, if we try to find an English equivalent
for German Ort, in its application as a technical
term, we find

Ort (1) [in geography] -> place
    (2) [in geometry]  -> locus

It is usually maintained that the two are sufficiently
differentiated by the fact that Ort (1) [geogr.] has the plural
Orte, whereas Ort (2) [geom.] has the plural Oerter. This
morphological distinction is fully sufficient for the plural [...]





The same situation can be observed in die Mutter, plural
Muetter "mother"; die Mutter, plural Muttern "nut": the
singular is not morphologically differentiated. On the
contrary, das Erkenntnis "decision, judgment, sentence" and
die Erkenntnis "comprehension, perception, cognition" are
well differentiated in the singular, but since they have
identical forms in the plural, die Erkenntnisse, they should
be semantically differentiated as a juristic, and a
psychological and philosophical term, respectively.
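
The partial value of a morphological difference can be shown in a short sketch for Ort: the plural forms decide by themselves, whereas the singular has to fall back on a semantic indication. The two tables and the function translate_ort are hypothetical, and the domain label is assumed to come from an analysis of the context which is not shown.

```python
# Plural forms of Ort are morphologically differentiated.
ORT_BY_PLURAL_FORM = {"Orte": "place", "Oerter": "locus"}

# The singular is not; a hypothetical domain mark must decide.
ORT_BY_DOMAIN = {"geography": "place", "geometry": "locus"}


def translate_ort(form, domain=None):
    """Choose the equivalent of Ort from the form, else from the domain.

    Returns None when neither the form nor the domain decides.
    """
    if form in ORT_BY_PLURAL_FORM:
        return ORT_BY_PLURAL_FORM[form]
    return ORT_BY_DOMAIN.get(domain)


print(translate_ort("Oerter"))            # the form alone decides
print(translate_ort("Ort", "geography"))  # the domain has to decide
print(translate_ort("Ort"))               # nothing decides
```

The same arrangement, with gender in place of number, would serve for das Erkenntnis and die Erkenntnis, where it is the plural that needs the semantic indication.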



Cases like those just discussed are particularly
disagreeable if there is a semantic difference only in a small
part of a paradigm. Let us discuss an example. We can
imagine a strongly reduced form of the entry of German
erledigen as follows:

erledigen (1) etwas    -> to finish, arrange, settle
          (2) jemanden -> to dispose of

This German verb has the normal participle erledigt which has
the same polysemy: Das ist erledigt "That is settled", Durch
die naechste Saeuberung wird er erledigt (werden) "He will be
disposed of by the next purge". This form, however, has [...]








Cases like this are rather treacherous. Dictionaries
are normally built on the principle that the form of the
source language in which the entry-word is indicated and
to which the equivalent is coordinated (the so-called
canonical form) is a representative of the whole paradigm
of the entry-word, that is, if the source language happens
to be a language with paradigms. Therefore, before
the inclusion of a word, with its equivalent(s), into the
dictionary, its whole paradigm should be checked, and the
more important semantic peculiarities of its single forms
should be duly noted.

If polysemy needs semantic differentiation by the context,
we can expect that the same will be true of homonymy
(overlapping as the two notions are). The situation is
basically identical, so there is no need to discuss special
examples. There is, however, a special type of situation,
in which a homonymous pair or polysemous meanings are
differentiated by the form. German Abrede generally has the
meaning of "understanding, agreement"; but the set expression
in Abrede stellen means "to disavow, to dispute".
This expression being rather frequent, the reduced entry
could have a form like:

Abrede (1)                   -> understanding, agreement
       (2) in Abrede stellen -> to disavow, to dispute

This [...] us to a topic which I shall mention only [...]










namely the fact that there are combinations [...]
set which have a unified meaning, and whi[...]
[...]laps with the
type of handgreiflich werden as [...]



function as a lexical unit of a language. There are
many various types of them. A dictionary of the type
under consideration here, prepared for coping with texts
of a limited range only, will hardly select many colorful
idioms such as Das Hasenpanier ergreifen "To fly away".
But it will have to list frequently occurring set expressions
like in Abrede stellen, particularly when their
meaning is not predictable from that of their individual
parts. Also, a dictionary of our type will probably select
many technical terms which consist of more than one word.
The technical terminology of any science gives many examples
of the type leichte Infanterie, schwere Infanterie, etc.
The situation in German is particularly easy, because a
large number of such terminological coinages have the form
of a compound word, cf. Dampflokomotive "steam engine".
Still, there is no predictable regularity in this, cf.

Sauerkraut "pickled cabbage", but
saure Gurken "pickled cucumber",

so that the lexicographer has to check the whole semantic
area carefully. It will also be necessary to have the
productive parts of compound words listed in the dictionary
as entries of their own if they have a regular effect on
the meaning of the whole compound. With real compound
words, this is not too frequently the case, but affixes
and elements which approach the status of affixes can be
treated this way. For instance, German ur- -> English "proto-";
pseudo- -> English "pseudo-", etc. Such an indication has
the big advantage that it is, so to say, productive: it can
take care of newly coined expressions (assuming they are
coined regularly), unknown at the moment of the compilation
of the dictionary.
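
How such productive elements could take care of a word absent from the dictionary may be sketched as follows. The prefix table repeats the two examples given above; the stem table, with its single pair typ : type, and the function translate_with_prefix are hypothetical and serve only to make the sketch runnable.

```python
# Productive elements listed as entries of their own (from the text).
PREFIXES = {"ur": "proto", "pseudo": "pseudo"}

# Hypothetical stem entries; only this one pair is assumed here.
STEMS = {"typ": "type"}


def translate_with_prefix(word, stems, prefixes):
    """Translate prefix + stem when the remainder is a known stem.

    The remainder must itself be an entry, so that a word which merely
    begins with the same letters (e.g. Urlaub) is not analyzed wrongly.
    """
    lowered = word.lower()
    for prefix, equivalent in prefixes.items():
        stem = lowered[len(prefix):]
        if lowered.startswith(prefix) and stem in stems:
            return equivalent + stems[stem]
    return None


print(translate_with_prefix("Urtyp", STEMS, PREFIXES))
print(translate_with_prefix("Pseudotyp", STEMS, PREFIXES))
print(translate_with_prefix("Urlaub", STEMS, PREFIXES))
```

The requirement that the remainder be a listed stem is a design choice of the sketch, corresponding to the condition stated above that the coinage be a regular one.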



There are some points which may deserve to be mentioned.
Many a dictionary tends to forget that we find
multiword lexical units not only among the denotative
words. But the inclusion into the dictionary of expressions
like German ab und zu "from time to time", or
German auf und ab "up and down" is useful. And again,
we will have to put into the dictionary indications of
how to discern polysemy. Consider the difference between
German von (heute, nun, jetzt, gestern, etc.) ab, English
"from (today, now, yesterday, etc.) onwards", and German
vom Bahnhof ab (geht die Straße bergab) "the street begins
to go downhill at the station". Therefore, a strongly
reduced part of the entry should have the form:

von ... ab (1) [Zeitangaben] -> from ... onwards
           (2) [Ortsangaben] -> at, beyond



A particularly obnoxious type of set expressions are
those which allow a certain variation. For instance, German
es (tut, schadet, macht) nichts has a good English equivalent
in "it does not matter". It would seem that there is no
complication in this. Let us, however, consider the following
sentences: Es tut nichts. "It does not matter". Er tut
nichts. "He is doing nothing". This shows us that a set
expression may have parts which allow some variation, but again,
it has parts that do not. Therefore, a good dictionary will
have to contain indications of the following type:

es (tut, schadet, macht) (nichts, wenig) -> it does not matter
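
An indication of this type maps rather directly onto a pattern with a fixed part and two variable parts. The following sketch uses Python's standard re module; the pattern name and the function translate_set_expression are hypothetical, and the sketch treats only a whole short sentence, not the expression embedded in a longer one.

```python
import re

# Fixed part: "es". Variable parts: one of three verbs, one of two
# complements. Capitalization and a final period are tolerated.
ES_TUT_NICHTS = re.compile(
    r"^es (tut|schadet|macht) (nichts|wenig)\.?$",
    re.IGNORECASE,
)


def translate_set_expression(sentence):
    """Return the set-expression equivalent, or None if it is not one."""
    if ES_TUT_NICHTS.match(sentence.strip()):
        return "It does not matter."
    return None


print(translate_set_expression("Es tut nichts."))
print(translate_set_expression("Er tut nichts."))  # fixed part differs
```

The second sentence is not recognized because its subject is not the fixed es, so it is left to the ordinary entry of tun.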




It can be said that the most difficult problem will
be how to guarantee that an automatic device will make
the correct choice from among the partial equivalents
of the target language. This task is so difficult in
itself that we should not make it even more difficult
by indicating too many (partial) equivalents of the
target language. Let us consider some entries discussed
above. An entry of the type

auslegen (2) etwas mit etwas
    (a) [Teppiche]  -> "to cover with (carpets), to carpet"
    (c) [Elfenbein] -> "to inlay, encrust with (ivory)"

does not strike us as unusual. The verbs to inlay and
to encrust are synonymous for all practical purposes.
Every human user of a dictionary is accustomed to
understanding an indication like this so that he is free to
use either one or another synonym.



On the other hand, if we take a part of the entry of
erledigen discussed above

erledigen (1) etwas -> to finish, arrange, settle

we see that it has the same form, but the difference is in
the fact that the English verbs are rather mere near-synonyms
than full synonyms. Again, a human user is accustomed to [...]
probably more borderline cases than clear-cut ones. And
then again, a human user does not need a typographical
indication of the distinction so badly: if he is a native
speaker of the target language, he knows the distinction
anyhow; if he is a speaker of the source language, he may
make an error in his choice, but an error which will not be
too grave, and with growing knowledge of the target language,
he will also acquire the "feeling" for when to use one or
another of the near-synonyms.



This is how bilingual dictionaries, particularly the
smaller ones, operate: they rely on the abilities and knowledge
of the human user. The indications of such dictionaries
very frequently have the main purpose of triggering in the
human user thinking and imaginative processes which make him
recollect words and expressions not immediately indicated in
the dictionary. We cannot rely on all these abilities when
we construct a dictionary for mechanical use. Therefore, the
rule should be that there should be no unspecified indication
of synonyms as partial equivalents: if there is more than one
partial equivalent, they should be accompanied by the necessary
restrictive glosses which will show which to choose. If
both equivalents can really be used unrestrictedly, i.e. if
they are fully synonymous, it is possible to indicate only
one of them (preferably the more frequently occurring one) or
to indicate the possibility of free variation, e.g. for
stylistic purposes.
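
The rule lends itself to a mechanical check of the entries before they are admitted into the dictionary. The sketch below assumes a hypothetical representation in which each choice of an entry is a small record with a restrictive gloss, a list of equivalents, and an optional mark of free variation; none of these names comes from the article.

```python
def unspecified_indications(choices):
    """List the (index, problem) pairs that violate the rule.

    A choice violates the rule if it stands beside other choices
    without a restrictive gloss, or if it lists several equivalents
    without marking them as free variants.
    """
    problems = []
    for index, choice in enumerate(choices):
        if len(choices) > 1 and not choice.get("gloss"):
            problems.append((index, "missing restrictive gloss"))
        several = len(choice["equivalents"]) > 1
        if several and not choice.get("free_variation", False):
            problems.append((index, "unspecified synonyms"))
    return problems


ERLEDIGEN_1 = [
    {"gloss": None, "equivalents": ["to finish", "to arrange", "to settle"]},
]
AUSLEGEN_2 = [
    {"gloss": "Teppiche", "equivalents": ["to carpet"]},
    {"gloss": "Elfenbein", "equivalents": ["to inlay", "to encrust"],
     "free_variation": True},
]

print(unspecified_indications(ERLEDIGEN_1))
print(unspecified_indications(AUSLEGEN_2))
```

The entry of erledigen is reported because its three near-synonyms are given without any indication of which to choose, whereas the marked free variation in the entry of auslegen passes.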







This statement is focussed particularly on bilingual
dictionaries of living languages for general use. Large
philological dictionaries of dead languages are of a [...]




To prepare a dictionary which will reach this degree of
explicitness and accuracy is an extremely difficult task.
Moreover, I am afraid that even when all this is done, there
will still occur situations in which the automatic device
will not be able to make a choice. This may occur, for instance,
in any text where the relevant context is not close
to the passage which needs disambiguation. It would seem
that in such a situation no random choice should be made
but both (or all) possible equivalents should be printed
in the output with a sign showing their mutual complementarity.
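
A minimal sketch of such an output convention follows. The braces and the vertical bar are an arbitrary notation chosen for the illustration, since the article does not prescribe any particular sign; the function name render_equivalents is likewise hypothetical.

```python
def render_equivalents(candidates, sign=" | "):
    """Print one equivalent plainly, several with a complementarity sign.

    Duplicates are dropped while the original order is kept, and no
    candidate is ever picked at random.
    """
    unique = list(dict.fromkeys(candidates))
    if not unique:
        return ""
    if len(unique) == 1:
        return unique[0]
    return "{" + sign.join(unique) + "}"


print(render_equivalents(["place"]))
print(render_equivalents(["place", "locus"]))
```

A reader of the output thus sees at once where the device could not decide, and between which equivalents.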



A similar but much worse situation will occur when the
automatic device is faced with a neologism, i.e. with a
genuinely new expression or with an "old" expression used
with a new sense. To discuss this difficulty, however, is
quite a different task, because an attempt at the solution
of this problem would require an investigation of the
regularity of new coinages. For instance, new terminological
coinages tend to have a high degree of regularity. In any
case, a discussion of these problems must be reserved for
another occasion.



B I BL rOdRAPHY 

/ 



ARAPOV , 
1967 



M « V 



S intake iahestfay a model 1 yasykov (A syntactic model 
of language*) [with V . B, Borshchev]. Moscow; Nauka 



Avtomatisatsiya pereboda tekstov (Automatization of the trans- 
lation of texts) , Ed i tor fa 1 Int roduc't ion to a 
part iy translation of the ALPAC report Language 
and Mjtohines s Computers - Translation and- 
Lingtiis ti os . NT I , ser. 2, No, 8 . 



BACH , EMMON 



197? /"Syntax since Aspects' 1 , in'Beport of the 22nd Annual 
Bound Table Meeting, Washington, D.C.: Georgetown 
University Press. 



BAR-HILl'EL, YEHOSHUA 

196f Language and Information, 




Add i so n -Wes 1 



Read i ng , Mass 
y Publishing Co . . 



. : . . - 

"Dictionaries and mean] ng rules" 
Language 3 : 40 9 - I 4. ) : 



Foundations of 



"Measures of syntactic complexity" (with A. Kasher 
and E, Shamir], 23-50 in Machine Translation ^ A.D. 
Booth , ed . Ams terdap: North-Hoi land. New York; 
John Wiley s Sons , 



196 7c 



1 968 



"Review of The Structure of Language : Readings in 
the Philosophy of Language ;(ed.^by J . A. Fodor and 
J . J ; Katz). 11 . . Language 43 : 526-50 . 



7- ■/■ 




• 11 U n (versa 1 semanti ^s.c and • p h i losophy of 1 anguage ; 

Quanda r i es and p ros pec ts " i n Substance and Struature 
of Language , J . Puhvel ,«ed . .Berkeley and Los : Angel es, 
Ca 1 i forn i a ; • Un i ve rs i ty of Ca 1 ifornia P ress 

"Forma 1 l ogT c and inatu ra 11 anguiages-:: A sympos i urn". 
Foundations of Language 5:256-84. 

Some ref 1 ect ions/ on the present out look for high* 
quality machine transl atlon (Pos I tion paper on 



7 ; - 

■7. 




BORSHCHEV, V. B . 

1967 Lisp ozitsii ^ algoritmi i poroshdayushahiye 

procedury (Dispositions, algorithms, and generative 
procedures) [with Yu. A. Shrei der] . Moscow: Nauka. 

FI LLMORE , CHARLES 

1968a "The case for case", 1-88 In Universale in 

Linguistic Theory £ E. Bach and R. Harms, eds . . 

New York: Holt, Rinehart S Winston. 



1 968b 



1968c 



"Lexical entries for. verbs". 
Language 4 : 373 - 9 3 • 



Foundations of 



"Types 

Papers 



of lexical Information", 65-1.03 in 
in Linguistics No. 3. Columbus, Ohio: 
Ohio State University, [Also to be in Semantics 
An Interdisciplinary Reader in Philos ophy , 
Linguistics 3 Anthropology 3 and Psychology 9 
D, Steinberg and L. Jakobovits, eds.. Cambridge, 
Mass.; Cambridge University Press; and in 
Proceedings of the Balatons sab adi Conference on 
Mathematical Linguistics ,, F . Ki ef er , ed. 
Dordrecht, Holland: D.Reidel.] 



1969 



19 70 : 



'Verbs of judging: An exercise 
ion", 91-117 in Papers in 



i n 



s ema n 1 1 c 
1 : I 



deseri p ■ 



Tallahassee, Florida: Florida State University. 

"On a fully developed system of linguistic 

descr i p t 1 on" .i Au s t i n , Texas; Li ngu i s t i cs Research 

Center, The University of Texas at Austin. 



FRASER , J . BRUCE 



An Examination of the Verb-Particle Cons trucii on in 
English. P h . D . dissertation. Camb r i dge , Ma s s . : 

M . 1 . ' T . ' ••• * -vl - £.„• ’•V-'-i 



1966a 




FRAS E R , con t 1 d 

^970 11 Idioms within a transformational g r a mm a r 1 1 . 

Foundations of Language 6:22-42. 



GARVIN, PAUL L. 



I 9 66a 



Pi ‘‘edi cation Typing — A Pitot Study in Semantic 

Ana Zy ste [with J . Brewer and M . Math i ot] . 

Canoga Park, California: Bunker - Ra mo Corporation. 



19 66b 



Some comments on algorithm andl gramm'ar in the 
automatic parsing of natural languages". 
Mechanical Translation 9:2-3. '• - 



1967 

1970 



"Machine translation - fact or fancy?" Datamation . 
April. ' 



translation: a 



"Operational problems of machine „ B . Wll 

position paper". Austin, Te^cas : 'L i ngu Is t i cs 
Research Center, The University of Texas at Austin 
(appended ) . 



J0SSELS0N, HARRY H. 

1967 "Lexicography and the compute r" 1 , ?1 046- 1 059 in 
To Honor Roman Jakobson . The Hague, Holland: 



Mou ton . 



/ 



. T.I 1 e lexicon: a matrix of 1 exemes and the? r 
properties", in Proceedings of the Balatonssabadi 
Conference on Mathemaii cal Li ngui a ti as y F , K t e f e r , 
ed.. Dordrecht,.. Holland: D .1 Re t{de 1 . 



1968b Research in MT: Russian to English. 1: Ten Year
Summary Report. Detroit, Michigan: Wayne State
University.



1969a "The lexicon: a system of matrices of lexical units
and their properties", paper presented at the
International Conference on Computational Linguistics.
Stockholm, Sweden: KVAL.




KARTTUNEN, LAURI

1967 "The identity of noun phrases". Santa Monica,
California: The RAND Corporation.

1968a "What do referential indices refer to?" Santa
Monica, California: The RAND Corporation.

1968b "What makes definite noun phrases definite?"
Santa Monica, California: The RAND Corporation.

1969a "Discourse referents", Preprint No. 70,
International Conference on Computational Linguistics.
Stockholm, Sweden: KVAL.

1969b "Problems of reference in syntax". Austin, Texas:
The University of Texas at Austin, mimeograph.

1970a "The logic of English predicate complement
constructions". Austin, Texas: Linguistics Research
Center, The University of Texas at Austin (appended).

1970b "On the semantics of complement sentences",
328-39 in Papers from the 6th Regional Meeting.
Chicago, Illinois: Chicago Linguistic Society.

"Implicative verbs". Language.

KAY, MARTIN

1967 "Experiments with a powerful parser", paper No.
[...] France. Santa Monica, California: The RAND
Corporation.

1969 "Computational competence and linguistic
performance". Santa Monica, California: The RAND
Corporation.

1970a "The MIND system: The morphological-analysis
program" [with G. R. Martins]. Santa Monica,
California: The RAND Corporation.

1970b "The MIND system: The structure of the semantic
file" [with S. Y. W. Su]. Santa Monica, California:
The RAND Corporation.

1970c "Performance grammars". Santa Monica, California:
The RAND Corporation.




KULAGINA, O. S.

1969 "Eshche raz k voprosu o realizatsii avtomaticheskogo
perevoda (Once more on the problem of the realization
of automatic translation)" [with I. A. Mel'chuk and
V. Yu. Rozentsveyg]. NTI, ser. 2, No. 11.

LAKOFF, GEORGE

1965 On the Nature of Syntactic Irregularity, Ph.D.
dissertation. Bloomington, Indiana: Indiana
University. Mathematical Linguistics and Automatic
Translation Report No. NSF-16. Cambridge,
Massachusetts: Harvard Computation Laboratory.
[Published as Irregularity in Syntax, New York:
Holt, Rinehart & Winston, 1970.]

1968a "Deep and Surface grammar". Bloomington, Indiana: 
Indiana University Linguistics Club, mimeograph. 

1968b "Instrumental adverbs and the concept of deep 
structure". Foundations of Language 4:4-29. 

1968c "Is deep structure necessary?" [with J. R. Ross].
Bloomington, Indiana: Indiana University Linguistics
Club, mimeograph.

1968d "Pronouns and reference". Bloomington, Indiana:
Indiana University Linguistics Club, mimeograph.

1969 "On derivational constraints", 117-39 in Papers
from the 5th Regional Meeting, Chicago Linguistic
Society. Chicago, Illinois: Department of
Linguistics, University of Chicago.

1970 "Natural logic and lexical decomposition", 340-62
in Papers from the 6th Regional Meeting, Chicago
Linguistic Society. Chicago, Illinois: Chicago
Linguistic Society.






LAKOFF, ROBIN, cont'd



1970 "Tense and its relation to participants". 

Language 46:838-49. 

to appear "Questionable answers and answerable questions",
in Papers in Linguistics in Honor of Henry and
Renee Kahane, B. Kachru, et al., eds.

LEHMANN, WINFRED P.

1969 Research in Russian-English Machine Translation
on Syntactic Level, vols. 1 & 2 [with L. W. Tosh,
R. R. Macdonald, and M. Zarechnak]. Austin, Texas:
Linguistics Research Center, The University of
Texas at Austin.

1970 Research in German-English Machine Translation on
Syntactic Level, vols. 1 & 2 [with R. Stachowitz
and L. W. Tosh]. Austin, Texas: Linguistics
Research Center, The University of Texas at Austin.

LYONS, JOHN

1967 "A note on possessive, existential and locative
sentences". Foundations of Language 3:390-96.

1970 "The feasibility of high-quality machine
translation". Austin, Texas: Linguistics Research
Center, The University of Texas at Austin (appended).



McCAWLEY, JAMES D.

1968a "Concerning the base component of a transformational
grammar". Foundations of Language 4:243-69.

1968b "Lexical insertion in a transformational grammar
without deep structure", 71-80 in Papers from the
4th Regional Meeting, Chicago Linguistic Society.




MEY, JACOB

1969 "Syntax or semantics: Some controversial issues
in computational linguistics". Norsk Tidsskrift
for Sprogvidenskap 23:59-70.

1970 "Toward a theory of computational linguistics",
paper presented at the Annual Meeting of the
Association for Computational Linguistics.
Austin, Texas: Linguistics Research Center,
The University of Texas at Austin.


NATIONAL ACADEMY OF SCIENCES - NATIONAL RESEARCH COUNCIL

1966 Language and Machines: Computers in Translation
and Linguistics, a report by the Automatic Language
Processing Advisory Committee, Division of Behavioral
Sciences. Publication 1416. Washington, D.C.



NIDA, EUGENE A.

1969 The Theory and Practice of Translation [with
Charles R. Taber]. Leiden: Brill.



PETERS, P. STANLEY, Jr.

1969a "Ambiguity, completeness and restriction problems
in the syntax-based approach to computational
linguistics" [with R. Tabory]. Linguistics
46:54-76.

1969b "A note on the universal base hypothesis" [with
R. W. Ritchie]. Journal of Linguistics 5:151-52.



PETRICK, STANLEY R.

1965a "On the relative efficiencies of context-free
grammar recognition" [with T. V. Griffiths].
Bedford, Massachusetts: USAF Cambridge Research
Laboratories.

1965b A Recognition Procedure for Transformational
Grammars. Ph.D. dissertation. Cambridge,
Massachusetts: M.I.T.

1966 "A program for transformational syntactic analysis".
Bedford, Massachusetts: USAF Cambridge Research
Laboratories.

1967 "Syntactic analysis" [with S. J. Keyser]. Bedford,
Massachusetts: USAF Cambridge Research Laboratories.



PETRICK, cont'd

1969 "On coordination reduction and sentence analysis"
[with P. M. Postal and P. S. Rosenbaum].
Communications of the ACM 12:223-33.

1971a "On the use of syntax-based translators for
symbolic and algebraic manipulation", 224-37 in
Proceedings of the Second Symposium on Symbolic
and Algebraic Manipulation. New York: Association
for Computing Machinery.

1971b "Syntactic analysis for transformational grammars".
Austin, Texas: Linguistics Research Center, The
University of Texas at Austin.

1971c "Syntactic analysis requirements of machine
translation". Austin, Texas: Linguistics Research
Center, The University of Texas at Austin (appended).



PUMPYANSKIY, A. L.

1966 Uprazhneniya po perevodu nauchnoy i tekhnicheskoy
literatury (Studies on the translation of scientific
and technical literature). Moscow: NAUKA.



ROSS, JOHN ROBERT

1965 "Underlying structures in discourse" [with T. G.
Bever], VIII/1-12 in Proceedings of the Conference
on Computer-Related Semantic Analysis. Detroit,
Michigan: Wayne State University.

1967 Constraints on Variables in Syntax. Ph.D. dissertation.
Cambridge, Massachusetts: M.I.T. Bloomington,
Indiana: Linguistics Club, mimeograph.

1970a "On declarative sentences", in Readings in English
Transformational Grammar, R. Jacobs and P. Rosenbaum,
eds. Waltham, Massachusetts: Ginn-Blaisdell.

1970b "Gapping and the order of constituents", in
Proceedings of the 10th International Congress of
Linguists, Bucharest.



SIMMONS, ROBERT F.

1966 "Storage and retrieval of aspects of meaning in
directed graph structures". Communications of the
ACM 9:211-15.



SIMMONS, cont'd

1967 "Answering English questions by computer", 253-89
in Automated Language Processing, H. Borko, ed.
New York: John Wiley & Sons. [Also in Communications
of the ACM 8 (1965):53-70.]

1970a "Generating English discourse from semantic networks"
[with J. Slocum]. Austin, Texas: Computer Assisted
Instruction Laboratory, The University of Texas at
Austin.

1970b "Natural language question-answering system".
Communications of the ACM 13:15-31.



STACHOWITZ, ROLF

1969 On the definition of the term "discontinuous
constituents" in context-free phrase structure
grammar, Ph.D. dissertation. Austin, Texas:
The University of Texas at Austin.

1970a "The construction of a computerized dictionary",
paper presented at the Modern Language Association
Lexicographical Conference, Columbus, Ohio. Austin,
Texas: Linguistics Research Center, The University
of Texas at Austin.

1970b "A model for the recognition and production of
synonymous expressions with different deep structures",
paper presented at the Conference on Linguistics at
the University of Iowa, Iowa City. Austin, Texas:
Linguistics Research Center, The University of Texas
at Austin.

1971a "Lexical features in translation and paraphrasing:
an experiment". Austin, Texas: Linguistics Research
Center, The University of Texas at Austin.

1971b "Requirements for Machine Translation:
Problems, Solutions, Prospects". Austin, Texas:
The University of Texas at Austin (appended).

to appear Normalization of Natural Language for Information
Retrieval [with W. P. Lehmann]. Austin, Texas:
Linguistics Research Center, The University of
Texas at Austin.

Development of German-English Machine Translation
System [with W. P. Lehmann]. Austin, Texas: The
University of Texas at Austin.





SWANSON, ROWENA

1967 MOVE THE INFORMATION ... A Kind of Missionary Spirit.
Arlington, Virginia: USAF Office of Aerospace Research.

1970 "Trend in information handling in the United States".
Arlington, Virginia: USAF Office of Scientific
Research.

WALKER, DONALD E.

1965a "English preprocessor manual". Bedford, Massachusetts:
The MITRE Corporation.

1965b "The MITRE syntactic analysis procedure for
transformational grammars" [with A. M. Zwicky,
J. Friedman, and B. C. Hall], 317-26 in Proceedings
of the 1965 Fall Joint Computer Conference. New York
and Washington, D.C.: Spartan.

1966 "Recent developments in the MITRE syntactic analysis
procedure" [with P. G. Chapin, M. L. Geis, and
L. N. Gross]. Bedford, Massachusetts: The MITRE Corp.

1967a "On-line text processing: Introduction and overview".
Bedford, Massachusetts: The MITRE Corporation.

1967b "SAFARI, an on-line text-processing system",
Vol. IV, 144-47 in Proceedings of the American
Documentation Institute. London, England:
Academic Press.

1969a "Computational linguistic techniques in an on-line
system for textual analysis". Bedford, Massachusetts:
The MITRE Corporation.

1969b "On-line computer aids for research in linguistics"
[with L. N. Gross], 1531-36 in Information
Processing 68. Amsterdam: North-Holland.

1970 "The current status of computer hardware and
software as it affects the development of high
quality machine translation". Austin, Texas:
Linguistics Research Center, The University of
Texas at Austin (appended).

WINOGRAD, TERRY

1970 Procedures as a Representation for Data in a
Computer Program for Understanding Natural
Language. Ph.D. dissertation. Cambridge,
Massachusetts: M.I.T. [Revised version published
1971, Cambridge, Massachusetts: M.I.T., Project
MAC.]



ZGUSTA, LADISLAV

1970a "Equivalents and explanations in bilingual
dictionaries", paper presented at the Modern
Language Association Lexicographical Conference, 
Columbus, Ohio. Austin, Texas: Linguistics Research 

Center, The University of Texas at Austin (appended). 

1970b "The shape of the dictionary for mechanical trans- 
lation purposes". Austin, Texas: Linguistics 

Research Center, The University of Texas at Austin 
(appended) . 

to appear Manual of Lexicography. Prague, Czechoslovakia:
Academy of Sciences. 







Security Classification

DOCUMENT CONTROL DATA - R&D

(Security classification of title, body of abstract and indexing annotation must be entered when the overall report is classified)



1. ORIGINATING ACTIVITY (Corporate author)

University of Texas
Linguistics Research Center
Austin, Texas 78712

2a. REPORT SECURITY CLASSIFICATION

UNCLASSIFIED

2b. GROUP

N/A

3. REPORT TITLE

FEASIBILITY STUDY ON FULLY AUTOMATIC HIGH QUALITY TRANSLATION
Volume II

4. DESCRIPTIVE NOTES (Type of report and inclusive dates)

Final Report 1 February 1970 - 30 June 1971

5. AUTHOR(S) (First name, middle initial, last name)

Dr. Winfred P. Lehmann
Dr. Rolf Stachowitz

6. REPORT DATE

December 1971

7a. TOTAL NO. OF PAGES

252

7b. NO. OF REFS

108

8a. CONTRACT OR GRANT NO.

F30602-70-C-0129
Job Order No. 45940000

9a. ORIGINATOR'S REPORT NUMBER(S)

None

9b. OTHER REPORT NO(S) (Any other numbers that may be assigned this report)

RADC-TR-71-295, Volume II (of two)

10. DISTRIBUTION STATEMENT

Approved for public release; distribution unlimited.

11. SUPPLEMENTARY NOTES

None

12. SPONSORING MILITARY ACTIVITY

Rome Air Development Center (IRET)
Griffiss Air Force Base, New York 13440

13. ABSTRACT




This report presents the results of a theoretical inquiry into the feasibility
of a fully automatic high quality translation (FAHQT), according to Bar-Hillel's
definition of this term. The purpose of this inquiry consisted in determining the
viability of the FAHQT concept in the light of previous and projected advances in
linguistic theory and software/hardware capabilities. The corollary purpose was to
determine whether this concept can be taken into consideration as a legitimate and
justifiable objective of R&D. The effort was supported by 20 expert consultants
from the various universities and research centers in the U.S.A. and abroad.
Conclusions and recommendations are presented on pages 44-50 of the report. Individual
contributions of participants and consultants reflect a wide range of opinions
concerning the prospects of FAHQT in intermediate and long range of R&D.







DD Form 1473

14. KEY WORDS

Linguistic Theory
Computational Linguistics
Machine Translation R&D
Research in Syntax/Semantics
Lexicography in Machine Translation
Software/Hardware Capabilities